Intel Corporation is a leading supplier of microcomputer components, 
modules and systems. When Intel first introduced the microprocessor in 1971, 
it created the era of the microcomputer. Today, Intel architectures are considered 
world standards. Intel products are used in a wide variety of applications, including 
embedded systems such as automobiles, avionics systems and telecommunications 
equipment, and as the CPU in personal computers, network servers and 
supercomputers. Others bring enhanced capabilities to systems and networks. 
Intel's mission is to deliver quality products through leading-edge technology. 
INTEL SERVICE 

INTEL’S COMPLETE SUPPORT SOLUTION WORLDWIDE 

Intel Service is a complete support program that provides Intel customers with hardware support, software 
support, customer training, and consulting services. For detailed information contact your local sales offices. 

Service and support are major factors in determining the success of a product or program. For Intel this 
support includes an international service organization and a breadth of service programs to meet a variety of 
customer needs. Intel service is extensive. It includes On-Site Installation and 
Maintenance for Intel and non-Intel systems and peripherals, Repair Services for Intel OEM Modules and 
Platforms, Network Operating System support for Novell NetWare and Banyan VINES software, Custom 
Integration Services for Intel Platforms, Customer Training, and System Engineering Consulting Services. Intel 
maintains service locations worldwide, so wherever you’re using Intel technology, our professional staff is 
within close reach. 

ON-SITE INSTALLATION AND MAINTENANCE 

Intel’s installation and maintenance services are designed to get Intel and Intel-based systems, and the 
networks they use, up and running fast. Intel’s service centers are staffed by trained and certified Customer 
Engineers throughout the world. Once systems are installed, Intel is dedicated to keeping them running at 
maximum efficiency while controlling costs. 

REPAIR SERVICES FOR INTEL OEM MODULES AND PLATFORMS 

Intel offers customers of its OEM Modules and Platforms a comprehensive set of repair services that reduce 
the costs of system warranty, maintenance, and ownership. Repair services include module or system testing 
and repair, module exchange, and spare part sales. 

NETWORK OPERATING SYSTEM SUPPORT 

An Intel software support contract for Novell NetWare or Banyan VINES software means unlimited access to 
troubleshooting expertise any time during contract hours — up to seven days per week, twenty-four hours per 
day. To keep networks current and compatible with the latest software versions, support services include access 
to minor releases and “patches” as made available by Novell and Banyan. 

CUSTOM SYSTEM INTEGRATION SERVICES 

Intel Custom System Integration Services enable resellers to order completely integrated systems assembled 
from a list of Intel386™ and Intel486™ microcomputers and validated hardware and software options. These 
services are designed to complement the reseller’s own integration capabilities. Resellers can increase business 
opportunities, while controlling overhead and support costs. 

CUSTOMER TRAINING 

Intel offers a wide range of instructional programs covering various aspects of system design and 
implementation. In just three to five days a limited number of individuals learn more in a single workshop than 
in weeks of self-study. Covering a wide variety of topics, Intel’s major course categories include: architecture 
and assembly language, programming and operating systems, BITBUS™, and LAN applications. 

SYSTEM ENGINEERING CONSULTING 

Intel provides field system engineering consulting services for any phase of your development or application 
effort. You can use our system engineers in a variety of ways ranging from assistance in using a new product, 
developing an application, personalizing training and customizing an Intel product to providing technical and 
management consulting. Working together, we can help you get a successful product to market in the least 
possible time. 
Intel uses various data sheet markings to designate each phase of the document as it 
relates to the product. The marking appears in the upper, right-hand corner of the data 
sheet. The following is the definition of these markings: 

Data Sheet Marking      Description 

Product Preview         Contains information on products in the design phase of 
                        development. Do not finalize a design with this 
                        information. Revised information will be published when 
                        the product becomes available. 

Advanced Information    Contains information on products being sampled or in 
                        the initial production phase of development.* 

Preliminary             Contains preliminary information on new products in 
                        production.* 

No Marking              Contains information on products in full production.* 

*Specifications within these data sheets are subject to change without notice. Verify with your local Intel sales 
office that you have the latest data sheet before finalizing a design. 
82750DB 

DISPLAY PROCESSOR 

Programmable Video Timing 
- 28 MHz and 45 MHz Operating Frequency 
- Pixel/Line Address Range to 4096 
- Fully Programmable Sync, Equalization, and Serration Components 
- Fully Programmable Blanking and Active Display Start and Stop Times 
- Genlocking Capability 

Flexible Display Characteristics 
- 8-, Pseudo 16-, 16-, and 32-Bit/Pixel Modes 
- Selectable Pixel Widths of 1.0, 1.5, 2.0, 2.5, through 14 Periods of the Input Frequency 
- Support Popular Display Resolutions: VGA, XGA, NTSC, PAL, and SECAM 
- On-Chip Triple DAC for Analog RGB/YUV Output 
- Mix Graphics and Video Images on a Pixel by Pixel Basis 
- Real Time Expansion of the Reduced Sample Density Video Color Components (U, V) to Full Resolution 
- Three Independently Addressable Color Palettes 
- Programmable 2X Horizontal Interpolation of Y Channel 
- 16 x 16 x 2-Bit Cursor Map with Independently Programmable 2X Expansion Factors in X and Y Dimensions 
- YUV to RGB Color Space Conversion 
- 2X Vertical Replication of Y, U, and V Data for Displaying Full Motion Video on VGA Monitor 
- Register and Function Compatible with the 82750DA 


Intel’s 82750DB is a custom designed VLSI chip used for processing and displaying video graphic information. 
It is register and function compatible with the 82750DA. 

Reset inputs allow the 82750DB to be genlocked to an external sync source. By programming internal control 
registers, this sync can be modified to accommodate a wide variety of scanning frequencies. A large selection 
of bits/pixel, pixels/line, and pixel widths is programmable, allowing wide latitude in trading off image 
quality against update rate and VRAM requirements. 

The 82750DB can operate in a digitizing mode, wherein it generates timing and control signals to the 82750PB 
and VRAM, but does not output display information. Besides digitizer support signals and video 
synchronization, the 82750DB outputs digital and analog RGB or YUV information and an 8-bit digital word of 
alpha data. This alpha channel data may be used to obtain a fractional mix of 82750DB outputs with another 
video source. 
Figure 1-1. 82750DB Pinout 

Table 1-1. Pin Cross Reference by Pin Name 


Pin Name 

Location 

ACTDIS 

87 

ALPHA[7] 

88 

ALPHA[6] 

90 

ALPHA[5] 

92 

ALPHA[4] 

93 

ALPHA[3] 

95 

ALPHA[2] 

96 

ALPHA[1] 

97 

ALPHA[0] 

102 

AVCC 

128 

AVSS 

125 

BG 

69 

BPP[1] 

85 

BPP[0] 

86 
72 

DATAIN[31] 

53 

DATAIN[23] 

DATAIN[20] 

Pin Name 

Location 

DBU[7] 

DBU[1] 

112 

DBU[0] 

113 

DGY[7] 

8 

DGY[6] 

9 

DGY[5] 

10 

DGY[4] 

11 

DGY[3] 

12 

DGY[2] 

13 

DGY[1] 

14 

DGY[0] 

15 

DISDAC 

66 

DISDIG 

84 


Pin Name 

Location 

DRV[7] 

114 

DRV[6] 

118 

DRV[5] 

119 

DRV[4] 

3 

DRV[3] 

4 

DRV[2] 

5 

DRV[1] 

6 

DRV[0] 

7 

FCO 

61 

FREQIN 

64 

GY 

129 

HRESET# 

60 

HSYNC 

71 

IREFIN 

130 

PIXCLK 

120 

RESETB# 

73 

RV 

126 

SCLK[1] 

77 

SCLK[0] 

74 

TEST# 

63 

TESTACT# 

62 

VBUS[3] 

81 

VBUS[2] 

80 

VBUS[1] 

79 

VBUS[0] 

78 

Vcc 

2 

Vcc 

33 

Vcc 

35 

Vcc 

45 

Vcc 

51 

Vcc 

65 

Vcc 

67 

Vcc 

75 


Pin Name 

Location 

Vcc 

82 

Vcc 

91 

Vcc 

98 

Vcc 

100 

Vcc 

104 

Vcc 

109 

Vcc 

116 

Vcc 

123 

Vcc 

127 

Vcc 

132 

VGCS 

121 

VRESET# 
Vss 

16 

Vss 

124 

Vss 

VSYNC 

70 

Table 1-2. Pin Cross Reference by Location 


Location 

Pin Name 

1 

Vss 

2 

Vcc 

3 

DRV[4] 

4 

DRV[3] 

5 

DRV[2] 

6 

DRV[1] 

7 

DRV[0] 

DGY[7] 

DGY[6] 

10 

DGY[5] 

DGY[3] 

21 

DATAIN[3] 

DATAIN[10] 

DATAIN[12] 

DATAIN[13] 

33 

Vcc 


Location 

Pin Name 

34 

Vss 

35 
DATAIN[14] 

37 

DATAIN[15] 
54 

DATAIN[28] 

55 

DATAIN[29] 

56 

DATAIN[30] 

57 

Vss 

58 

DATAIN[31] 

59 

VRESET# 

60 

HRESET# 

61 

FCO 

62 

TESTACT# 

63 

TEST# 

64 

FREQIN 

65 

Vcc 

66 

DISDAC 


Location 

Pin Name 

67 

Vcc 

68 

Vss 

69 

BG 

70 

VSYNC 

71 

HSYNC 

72 

CSYNC 

73 

RESETB# 

74 

SCLK[0] 

75 

Vcc 

76 

Vss 

77 

SCLK[1] 

78 

VBUS[0] 

79 

VBUS[1] 

80 

VBUS[2] 

81 

VBUS[3] 

82 

Vcc 

83 

CB 

84 

DISDIG 

85 

BPP[1] 

86 

BPP[0] 

87 

ACTDIS 

88 

ALPHA[7] 

89 

Vss 

90 

ALPHA[6] 

91 

Vcc 

92 

ALPHA[5] 

93 

ALPHA[4] 

94 

Vss 

95 

ALPHA[3] 

96 

ALPHA[2] 

97 

ALPHA[1] 

98 

Vcc 

99 

Vss 


Location 

Pin Name 

100 

Vcc 

101 

Vss 

102 

ALPHA[0] 

103 

DBU[7] 

104 

Vcc 

105 

DBU[6] 

106 

DBU[5] 

107 

DBU[4] 

108 

Vss 

109 

Vcc 

110 

DBU[3] 

111 

DBU[2] 

112 

DBU[1] 

113 

DBU[0] 

114 

DRV[7] 

115 

Vss 

116 

Vcc 

117 

Vss 

118 

DRV[6] 

119 

DRV[5] 

120 

PIXCLK 

121 

VGCS 

122 

BU 

123 

Vcc 

124 

Vss 

125 

AVss 

126 

RV 

127 

Vcc 

128 

AVcc 

129 

GY 

130 

IREFIN 

131 

Vss 

132 

Vcc 
Figure 1-2. 82750DB Functional Signal Groupings 

Table 1-3. Pin Descriptions 


Symbol 

Type 

Name and Function 

FREQIN 

I 

FREQUENCY INPUT CLOCK: In normal use, the 82750DB supplies refresh 
timing for an associated VRAM through the 82750PB. This places a lower limit 
on the line frequency, which is a programmed multiple of FREQIN. It must 
generate enough refresh cycles, so a minimum line rate of 4 kHz is required. 
Furthermore, the 82750PB may run no less than Vs the speed of the 82750DB, 
since the 82750PB samples the timing and control signals generated by the 
82750DB. The period of FREQIN is known as a “T” cycle. 

RESETB# 

I 

EXTERNAL RESET: Input signal which places all units in the 82750DB into an 
initialized state, and sets the transfer rate to a default value of 1/3X the 
operating frequency. It is an edge sensitive input which must be held low for a 
minimum of ten T-cycles. The slowest transfer rate is selected to ensure that 
the 82750DB will read the register information correctly during the first register 
transfer, independent of the speed of the VRAMs. During the reset state, the 
analog video outputs and digital outputs are set to the black level. This will 
occur a maximum of four cycles after RESETB# is set to a zero. This signal is 
also used in conjunction with the TESTACT# input to disable outputs. 

VRESET# 

I 

VERTICAL RESET: By programming a bit in an internal register, the 82750DB 
may be placed in the Genlock mode. If this mode is selected, assertion of 
VRESET# resets all vertical timing to the first line of the next field. It does not 
affect the horizontal timing, but does generate the on-chip end of field signals. It 
is an edge sensitive input that is sampled in the 82750DB at the internal time 
corresponding to the rising edge of FREQIN. If the Genlock mode has not been 
enabled, this signal will have no effect on the sync timing. The 82750DB will 
then operate in a free-running mode. Refer to Chapter 3 for a detailed 
description of genlocking the 82750DB. 

HRESET# 

I 

HORIZONTAL RESET: When in the Genlock mode, this input will reset all of 
the horizontal timing to the start of the line (beginning of horizontal sync). 
HRESET# does not affect vertical timing (except for an up-to one-line delay) or 
any other 82750DB registers. This signal is an edge sensitive input that is 
sampled in the 82750DB at the internal time corresponding to the rising edge 
of FREQIN. As was the case with the VRESET# signal, this input will be 
ignored when not in the Genlock mode. 

VBUS[3:0] 

O 

VDP COMMUNICATION BUS: The 82750DB outputs status and VRAM transfer 
requests over these lines to the 82750PB, for 2 to 16 T-cycles (as programmed 
by the user). Transfer requests can tie up the 82750DB/VRAM, 82750PB/VRAM, 
or 82750PB/82750DB (VBUS) interfaces for a longer period due to 
VRAM arbitration. When signals are not being sent out, the VBUS has value 
1111, the “null command.” 

SCLK[1:0] 

O 

VRAM SHIFT CLOCKS: Transfer requests to the 82750PB cause a VRAM 
address to be set up, and the VRAM serial registers loaded (in the case of 
displaying) or unloaded (in the case of digitizing). These signals are used to shift 
data out of and into the VRAMs. Both signals are identical, and run at a 
maximum rate of 1X of the pixel frequency, except during transfer requests, at 
which time they run at 1X, 1/2X, or 1/3X of the operating frequency of the 
82750DB, as programmed by the user. 

DATAIN[31:0] 

I 

DATA INPUT BUS: This is the input data clocked in from VRAM by the 
SCLK[1:0] signals. The format of the input data is a function of the programmed 
number of bits/pixel and of the type of transfer cycle being executed. Data will 
be sampled internally on the rising edge of FREQIN. 
Table 1-3. Pin Descriptions (Continued) 

Symbol 

Type 

Name and Function 

FCO 

O 

FRAME CAPTURE ON: This is the output signal which indicates to the digitizer 
that the VRAM serial port has been turned from read mode to write mode. The 
digitizer may then drive the (common) VRAM serial register data I/O pins. FCO 
will be asserted after the programmer specifies digitization, five lines after the 
start of the active vertical display, at the time of HSYNC. This gives the external 
logic time to switch directions of the VRAM serial data bus. This signal will end 
four lines after vertical active stops, at the next HSYNC, to make sure the digitizer 
is off before the next beginning-of-field register transfer. 

HSYNC 

O 

HORIZONTAL SYNCHRONIZATION: Video synchronization signal which is 
asserted at the beginning of every line and ends a programmed time later. (The 
duration of this signal is specified in T-cycles.) 

VSYNC 

O 

VERTICAL SYNCHRONIZATION: Video synchronization signal which can be 
programmed to start (once) and end (once) in every field. (The start and stop 
position may be specified in half-line units.) 

CSYNC 

O 

COMPOSITE SYNCHRONIZATION PULSE: This contains the programmed 
vertical serration and equalization information, as well as horizontal 
synchronization pulses. 

CB 

O 

COMPOSITE BLANKING: This signal can be programmed to end once and start 
once in each line, and end once and start once every field. 

BG 

O 

BURST GATE: This signal starts and stops at user-programmable horizontal 
positions in each line, in a programmable vertical group of lines. The primary use 
of this signal is to provide a “window” during which the BURST output should be 
inserted to generate a baseband NTSC signal. The output frequency is set by an 
integer divisor (0-31) and the rate of the FREQIN clock input. To use this 
effectively, the 82750DB must operate at an integer multiple of the NTSC 3.58 
MHz color subcarrier. The number is programmed in two’s complement form in 
the General Control register. 

PIXCLK 

O 

PIXEL CLOCK: This output signals valid data on the DGY, DRV, DBU, GY, RV, 
and BU lines. PIXCLK becomes active one-half of a T-cycle after valid data 
appears on DGY, DRV, or DBU, and coincident with GY, RV, and BU. During 
active display time it is issued at a steady rate of 1/(T-cycles/pixel) times per 
T-cycle, and otherwise at a steady rate of once per T-cycle. Its duration is one-half 
of a T-cycle, and its rising edge may synchronize with either rising or falling edges 
of FREQIN depending on the pixel frequency. This signal may be used to 
synchronize off-chip processing of the pixel data outputs. 

GY, RV, BU 

O 

ANALOG PIXEL OUTPUTS: These signals are the processed pixel data from the 
82750DB in analog form. During the display, these signals may be programmed to 
output pixel data in either YUV or RGB format. 

Output Format   DGY   DRV   DBU 
YUV             Y     V     U 
RGB             G     R     B 

DGY[7:0], DRV[7:0], DBU[7:0] 

O 

DIGITAL VIDEO OUTPUTS: These are the digital outputs of the GY, RV, and BU 
channels, respectively. They are valid with respect to the rising edge of PIXCLK. 

ALPHA[7:0] 

O 

ALPHA CHANNEL: These 8 bits are used to output a digital value for mixing the 
82750DB output with another video signal off-chip. The alpha channel information 
may be included in the pixel data, or may be output based on a comparison of the 
pixel data with user-programmed values. 

ACTDIS 

O 

ACTIVE DISPLAY: This is the active portion of the display as programmed by the 
user. It is delayed by the pipeline through the 82750DB, which is 5 lines vertically 
and a variable number horizontally, depending on the display mode. 
Table 1-3. Pin Descriptions (Continued) 

Symbol 

Type 

Name and Function 

BPP[1:0] 

O 

BITS PER PIXEL: During the nonactive display, the user programmed bits/pixel is 
encoded on these lines. During active display, the BPP[0] signal is multiplexed 
with a signal, Cursor Active, which indicates if the cursor data is currently active 
(non-transparent). When the Cursor Active output signal is asserted, this indicates 
that cursor overlay data is currently being output. Also during the active display, 
the BPP[1] signal is multiplexed with a signal, VUGR, which indicates whether the 
82750DB is operating in a graphics or video mode. When the VUGR output signal 
is asserted, this indicates the G, R, and B outputs are derived from the 
subsampled VU data. These pins allow users to latch the BPP[1:0] signals during 
nonactive display time (as indicated by ACTDIS being zero) for post-processing of 
the 82750DB output. The active cursor window on BPP[0] can be used during 
active display to multiplex other video streams into the output display. The 
following table illustrates the encoding on the BPP signals. 

Bits/Pixel   ACTDIS   BPP[0]          BPP[1] 
8            0        0               0 
16           0        0               1 
32           0        1               0 
pseudo 16    0        1               1 
8            1        Cursor Active   VUGR 
16           1        Cursor Active   VUGR 
32           1        Cursor Active   VUGR 
pseudo 16    1        Cursor Active   VUGR 


DISDAC 

I 

DISABLE ANALOG OUTPUTS: When this input is active, the Analog Pixel 
Outputs are set to a high-impedance state. 

DISDIG 

I 

DISABLE DIGITAL OUTPUTS: When this input is active, the digital outputs of the 
82750DB will be set to zero. In applications that use only the analog outputs of the 
82750DB, the digital outputs must be disabled. 

TESTACT# 

I 

TEST ACTIVE: Active low signal that is used in conjunction with the RESETB# 
signal to allow the chip to perform one of the following functions: 

RESETB#   TESTACT#   82750DB State 
0         1          Enter Reset State 
0         0          Enter Reset State, Tristate All Outputs, Analog Outputs are Zero 
1         1          Normal Operation 
1         0          Reserved 


TEST# 

I 

TEST INPUT: This signal must be set to VCC to guarantee correct chip operation. 

VGCS 

INTERNAL VOLTAGE REFERENCE: This signal must be decoupled to AVCC. 

IREFIN 

ANALOG CURRENT REFERENCE: Under normal operation, this signal should be 
tied to a temperature compensated current reference to AVSS. This signal must 
be decoupled to AVCC. 

AVcc 

ANALOG POWER pin provides +5 VDC supply to the Digital to Analog Converter. 

AVss 

ANALOG GROUND pin provides the 0V connection to which the analog outputs 
are referenced. This must be connected to VSS. 

Vcc 

POWER pins provide +5 VDC supply input. 

Vss 

GROUND pins provide the 0V connection to which all inputs and outputs are 
referenced. 
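
The BPP[1:0] encoding listed in Table 1-3 can be captured in a small lookup, which may be useful in a test bench or in post-processing logic that latches these pins. The following is only a sketch: the names `BPP_MODES` and `decode_bpp` and the dictionary return format are illustrative assumptions, not part of the 82750DB documentation.

```python
# Hypothetical helper: interpret sampled BPP pins per the BPP encoding in Table 1-3.
# Keys are (BPP[0], BPP[1]) as sampled while ACTDIS = 0 (nonactive display).
BPP_MODES = {
    (0, 0): "8",
    (0, 1): "16",
    (1, 0): "32",
    (1, 1): "pseudo 16",
}


def decode_bpp(actdis, bpp0, bpp1):
    """Return what the BPP pins mean for one sample of (ACTDIS, BPP[0], BPP[1])."""
    if actdis == 0:
        # Nonactive display: the pins carry the programmed bits/pixel.
        return {"bits_per_pixel": BPP_MODES[(bpp0, bpp1)]}
    # Active display: BPP[0] carries Cursor Active, BPP[1] carries VUGR.
    return {"cursor_active": bool(bpp0), "vugr": bool(bpp1)}
```

Keeping the two display phases in one function mirrors the way the pins are multiplexed, so a caller only needs to pass ACTDIS along with the two pin values.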
Table 1-4. Input Pins 

Name       Active Level   Synchronous/Asynchronous 
FREQIN     HIGH           Synchronous 
RESETB#    LOW            Asynchronous 
VRESET#    LOW            Asynchronous 
HRESET#    LOW            Asynchronous 
DISDIG     HIGH           Asynchronous 
TESTACT#   LOW            Asynchronous 
TEST#      LOW            Asynchronous 

All output pins have an active level of HIGH, and are 
floated when RESETB# and TESTACT# are set to 
a zero. The exceptions are GY, RV, and BU which 
will be forced to a zero level. 
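
The interaction of RESETB# and TESTACT# described in the pin descriptions, together with the ten T-cycle minimum reset pulse, can be sketched as follows. The function names and the use of plain integers for pin levels are assumptions made for illustration only.

```python
# Hypothetical model of the RESETB#/TESTACT# combinations from the pin descriptions.
# Pin levels are given as 0 or 1 (both signals are active low).
STATES = {
    (0, 1): ["Enter Reset State"],
    (0, 0): ["Enter Reset State", "Tristate All Outputs", "Analog Outputs are Zero"],
    (1, 1): ["Normal Operation"],
    (1, 0): ["Reserved"],
}


def chip_state(resetb, testact):
    """Return the list of effects for a given (RESETB#, TESTACT#) level pair."""
    return STATES[(resetb, testact)]


def min_reset_low_seconds(freqin_hz):
    """Shortest RESETB# low time: ten T-cycles, where one T-cycle = 1 / FREQIN."""
    return 10 / freqin_hz
```

For example, with an assumed FREQIN of 25 MHz, `min_reset_low_seconds(25_000_000)` evaluates to 4e-07 seconds, that is, 400 ns.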


2.0 ARCHITECTURE 


Overview 

There are 10 units in the 82750DB. Each of the units 
operates independently at the maximum clock rate 
input to the chip. The control information for each 
block is distributed in programmable registers 
throughout the chip. These registers are loaded on 
user-specified lines during the horizontal and vertical 
blanking intervals of the field. The register data that 
was read in from VRAM is passed from block to 
block during the blanking intervals of the display, on 
the same lines that the pixel information is passed 
during the active display. The Functional Block 
Diagram is shown in Figure 2-1. 

In order to maximize speed and compensate for 
processing delays, the chip is heavily pipelined. All 
inter-block information is delay-equalized to 
accommodate the different pipeline lengths in each 
module. As a result, the total pipeline delay is dependent 
on the number of processing units that are used to 
generate the display. Chapter 4 describes how the 
user programming is affected by these pipeline 
delays. 

Each of the units is described in more detail in the 
following sections of this chapter. 


Sync Generation and Timing 

The sync generation and timing block generates all 
of the internal timing and control signals, as well 
as the video synchronization signals. Sync and 
timing information may be derived from two sources: 
from the master clock, in which case the control 
registers on the 82750DB are programmed to provide 
the desired display frequency in terms of periods of 
the master clock (T-cycles), or from the horizontal 
and vertical external reset signals. (The latter 
is known as the genlock mode.) Characteristics 
such as line rate, blanking and border intervals, and 
composite synchronization parameters can be 
independently set. Since the 82750DB can be 
reprogrammed once each line, horizontal strips of 
different resolutions can be supported on the same 
display. However, the horizontal strips that can be 
supported are limited by the host processor’s 
response to redefining the bitmap pointers resident on 
the 82750PB. 

The horizontal and vertical display parameters are 
fully programmable. Figure 2-2 illustrates the 
horizontal programming parameters. The line starts at 
the programmed start position, with the length of 
half of a line programmed in T-cycles. The length of 
the total line is twice the half-line length. Parameters 
such as horizontal sync start, horizontal sync width, 
horizontal blanking start and stop, and horizontal 
active start and stop are all specified by the user. Note 
that the border time is not explicitly programmed, but 
is defined as the region of the display line where 
neither active display nor blanking is programmed to 
occur. In order for the 82750DB to function correctly, 
the width of the horizontal active display should be 
programmed such that the end of the horizontal 
active display coincides with the end of the last 
displayed pixel. 
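
The relationships in this paragraph (total line length equal to twice the half-line length, border time as whatever is neither active nor blanked, and an active width that ends exactly on a pixel boundary) can be checked numerically before programming the registers. The sketch below assumes all quantities are expressed in T-cycles; the function names are illustrative, and the 4 kHz limit is the minimum line rate quoted in the FREQIN pin description.

```python
from fractions import Fraction

MIN_LINE_RATE_HZ = 4_000  # lower limit quoted in the FREQIN pin description


def line_length_t(half_line_t):
    """Total line length in T-cycles: twice the programmed half-line length."""
    return 2 * half_line_t


def line_rate_hz(freqin_hz, half_line_t):
    """Line frequency produced by a given FREQIN and half-line length."""
    return freqin_hz / line_length_t(half_line_t)


def border_t(line_t, active_t, blanking_t):
    """Border time: the part of the line that is neither active nor blanked."""
    border = line_t - active_t - blanking_t
    if border < 0:
        raise ValueError("active + blanking exceeds the line length")
    return border


def pixels_per_line(active_t, pixel_width_t):
    """Displayed pixel count; the active width must hold a whole number of pixels."""
    count = Fraction(active_t) / Fraction(pixel_width_t)
    if count.denominator != 1:
        raise ValueError("active width is not a whole number of pixels")
    return int(count)
```

As a usage example under assumed values, a FREQIN of 28,636,360 Hz (eight times the NTSC color subcarrier) with a half-line length of 910 T-cycles gives a 1820 T-cycle line at roughly 15.7 kHz, comfortably above `MIN_LINE_RATE_HZ`. Using `Fraction` keeps fractional pixel widths such as 1.5 T-cycles exact.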

Figure 2-3 shows the vertical programming 
parameters. The basic unit for vertical programming is 
the half line, with the half-line count for each 
field starting at zero. Where appropriate for a 
parameter, the count is programmed in units of full lines. 
The length of the complete field is programmed in 
half lines, which makes it convenient for 
distinguishing between interlaced and non-interlaced displays. 
(For interlaced displays, the number of half lines is 
odd; for non-interlaced displays, it is even.) The 
vertical active and blanking regions may be 
independently programmed, with the border time defined as 
the region where neither blanking nor active display is 
on. 
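
The odd/even rule for the field length can be illustrated with a short calculation. This is a sketch under the assumption that an interlaced frame consists of two fields of equal length; the function names are hypothetical.

```python
def half_lines_per_field(lines_per_frame, interlaced):
    """Field length in half lines.

    Interlaced: each of the two fields holds lines_per_frame / 2 lines,
    which is exactly lines_per_frame half lines.
    Non-interlaced: the single field holds every line, so 2 * lines_per_frame.
    """
    if interlaced:
        if lines_per_frame % 2 == 0:
            raise ValueError("an interlaced frame needs an odd line count")
        return lines_per_frame
    return 2 * lines_per_frame


def is_interlaced(half_lines):
    """Apply the rule from the text: an odd half-line count means interlaced."""
    return half_lines % 2 == 1
```

For a 525-line interlaced frame each field is 262.5 lines, or 525 half lines (odd). A 262-line non-interlaced field is 524 half lines (even).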

NOTE: 

Sync parameters are completely independent of 
the display parameters. This allows the sync 
signals to be positioned anywhere in the field (even 
during active display). 
Figure 2-1. 82750DB Unit Level Diagram 
[Figure 2-2 diagram. Labels shown: Start of Horizontal Sync, Border Time, Active Display.] 

• All horizontal programming parameters are in periods of the master clock. 

• Border may be eliminated by programming the blanking time to abut the active display. 

• Pixel widths must be an integer divisor of horizontal active width. 

Figure 2-2. Horizontal Programming Parameters 
VBUS Control 

The VBUS controller sends all 82750DB requests 
for display bitmaps, VRAM refresh, and 
synchronization information to the 82750PB, at programmable 
times during a field. Transfer requests are scheduled 
to occur on a line basis, so only their vertical position 
(or line) is specified by the user. Other commands, 
like refresh requests, occur every line, and their 
horizontal position (or dot position) in the line must be 
specified by the user. Transfer requests are given 
the highest priority by the VBUS control circuit and 
are performed first during a blanking interval. The 
programmer has the responsibility of scheduling the 
line oriented codes, like refresh, so that they do not 
collide with the transfer requests. 

Besides arbitrating the scheduled transfer requests, 
the VBUS controller also reads the data from the 
VRAM shift registers using the two shift clock 
outputs (SCLK[1:0]). The code corresponding to the 
type of data to be read is asserted for a 
programmable number of cycles on the 4-bit VBUS. The 
82750DB then waits a programmable delay before 
reading the data from the VRAM. This delay should 
be long enough to guarantee that the 82750PB has 
completed loading the information into the serial 
shift register of the VRAM. Both shift clock signals are off while 
the code causing the transfer cycle is active on the 
VBUS, as well as during the read delay time. Figure 
2-4 illustrates this communication between the 
82750PB and the 82750DB. 


When the delay wait is over, the shift clock outputs 
are activated. The behavior of the SCLK[1:0] signals 
depends on the transfer rate that the user has 
selected: 1X, 1/2X, or 1/3X the operating 
frequency. Note that if the RESETB# signal is applied, 
the transfer rate is automatically set to 1/3X during 
the first automatic register transfer, regardless of the 
state of the transfer rate selection. The transfer rate 
may be changed in the first register transfer after 
RESETB# is set to a logic one value. 

Figure 2-5 illustrates how the SCLKs operate in the 
1X mode in a system. SCLK[1:0] signals will toggle 
between zero and one on the rising edge of 
FREQIN, after an internal logic delay. The data is 
read into the 82750DB on the rising edge of the 
internal clock, one 82750DB clock cycle after the 
SCLK outputs are asserted. Since there are 32 data 
input pins, each SCLK can read in the serial data 
from eight 256 x 4 VRAM memory devices. Adding 
external buffering to the SCLKs (to drive more 
memory) will also add delay to the memory access. The 
delay increase may require more than one T-cycle 
before the VRAM data is valid. In this case, the time 
between the rising edge of the internal 82750DB 
clock that generates the SCLKs and the edge that 
latches the data must be increased. 

There are two solutions: the operating frequency of the 
82750DB can be lowered to accommodate a longer 
T-cycle, or the 1/2X SCLK mode may be selected 
(as shown in Figure 2-6). When using the 1/2X 
transfer rate, the data is read into the 82750DB on 
the rising edge of the internal clock, two 82750DB 
clock cycles after the SCLK outputs are asserted. 


[Figure 2-4 timing diagram. Signals shown: 82750DB FREQIN, VBUS[3:0] (valid transfer code), SCLK[1:0], and VRAM output. Annotated intervals: programmable 82750DB VBUS code length (2-15 82750DB clock cycles) and programmable 82750DB delay (2-255 82750DB clock cycles). Annotated events: the 82750DB initiates the transfer request; the 82750PB must have finished decoding the VBUS code; the 82750PB must have executed the 82750DB transfer request (data should be in the serial shift register of the VRAM); the 82750DB samples data.] 

Figure 2-4. 82750PB/82750DB Communication 
Figure 2-7 illustrates the 1/3X (default) shift clock 
operation that is used during the RESET mode or may be 
programmed by the user. The first word of data is 
latched by the 82750DB on the rising edge of the 
FREQIN that is three T-cycles after the SCLK 
outputs were asserted. This allows three full 82750DB 
cycles for the VRAMs to output valid data, which 
gives extra margin for applications that need longer 
shift read cycles (due to slower memories or 
external logic delays) and do not wish to operate the 
82750DB at a slower speed. 
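
The trade-off between the three transfer rates can be sketched as a simple selection: data is latched one, two, or three T-cycles after the SCLK outputs are asserted, so the mode must leave enough time for the VRAM (plus any external buffering) to present valid data. The sketch below is a simplification that ignores the internal logic delay and input setup time; `access_ns` is an assumed, caller-supplied total delay, and the function names are hypothetical.

```python
# T-cycles between SCLK assertion and the latching clock edge, per transfer rate.
LATCH_CYCLES = {"1X": 1, "1/2X": 2, "1/3X": 3}


def t_cycle_ns(freqin_hz):
    """Length of one T-cycle (one period of FREQIN) in nanoseconds."""
    return 1e9 / freqin_hz


def fastest_transfer_rate(freqin_hz, access_ns):
    """Pick the fastest rate whose latch delay covers the required access time.

    Returns None when even the 1/3X rate is too fast, in which case the
    operating frequency itself would have to be lowered.
    """
    period = t_cycle_ns(freqin_hz)
    for mode, cycles in LATCH_CYCLES.items():  # dicts keep insertion order
        if cycles * period >= access_ns:
            return mode
    return None
```

With an assumed 25 MHz FREQIN (40 ns T-cycle), a 35 ns access path fits the 1X rate, a 60 ns path needs 1/2X, and a 100 ns path needs 1/3X.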
Figure 2-6. 82750DB 1/2X Shift Clock Operation 
When reading data from memory during active 
display, the SCLK[1:0] outputs operate at the rate 
required to support the programmed display rate. This 
rate is determined from the following equation: 

$$\text{SCLK rate} = \frac{\#\,\text{bits/pixel}}{(32\ \text{bits/word}) \times (\#\,\text{words/fetch}) \times (\#\,\text{T-cycles/pixel})}$$

where: # bits/pixel and # T-cycles/pixel are user-programmed 

# words/fetch is: 1 
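
As a worked illustration of the rate equation, the sketch below evaluates it with exact fractions. Reading the result as shift clocks per T-cycle is an interpretation, not something the text states outright, and the function name `sclk_rate` is hypothetical.

```python
from fractions import Fraction


def sclk_rate(bits_per_pixel, t_cycles_per_pixel, words_per_fetch=1):
    """Evaluate bits/pixel divided by (32 bits/word * words/fetch * T-cycles/pixel).

    The result is interpreted here (as an assumption) as the number of
    shift clocks needed per T-cycle to sustain the programmed display rate.
    """
    denominator = 32 * words_per_fetch * Fraction(t_cycles_per_pixel)
    return Fraction(bits_per_pixel) / denominator
```

For example, 16 bits/pixel at 2 T-cycles/pixel gives 16 / (32 x 1 x 2) = 1/4, while 32 bits/pixel at 1 T-cycle/pixel gives 1, the most demanding case.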

The SCLK[1:0] outputs will be the same frequency 
as the input clock in the 1X shift clock mode, and 
one half the input clock frequency when using the 
1/2X mode. The frequency will be one third of the 
input clock when using the 1/3X mode. In the 1/3X 
mode the SCLK[1:0] outputs will be high for one 
T-cycle, and low for 2 T-cycles. 

VBUS CODE DESCRIPTION 

When the 82750DB is actively fetching and displaying
pixels, VUXFER, BMX/YBMNPX, and REGX are typically sent
over the VBUS. Of the three codes, REGX has top priority,
followed by VUXFER, and lastly BMX/YBMNPX. These commands
may be programmed to occur on each active line, during the
blanking interval for the line just completed. If a
register transfer has been programmed for an active line,
it takes priority and is executed first. Any scheduled
VUXFER and BMX/YBMNPX commands are executed immediately
after the register transfer, or at once if no register
transfer is programmed. The programmer is responsible for
verifying that the sum of the times required by these
commands does not exceed the horizontal non-active display
time. The 82750DB will commence fetching pixels at the
subsequent start of active display. A detailed explanation
of the different types of VBUS commands and their
corresponding codes follows.
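
The priority ordering and the blanking-time budget described above can be sketched as follows. This is an illustrative model only: the function name, the dictionary layout, and the durations in the example are hypothetical, and the real time taken by each command depends on programmed values that are not modelled here.

```python
# Priority order stated in the text, highest first.
VBUS_PRIORITY = ["REGX", "VUXFER", "BMX/YBMNPX"]


def schedule_blanking(requests, blanking_t_cycles):
    """Order the requested commands by priority and check the time budget.

    requests: dict mapping command name -> duration in T-cycles
              (durations are supplied by the caller; hypothetical here).
    Returns (order, total_t_cycles, fits).
    """
    order = [code for code in VBUS_PRIORITY if code in requests]
    total = sum(requests[code] for code in order)
    return order, total, total <= blanking_t_cycles


if __name__ == "__main__":
    order, total, fits = schedule_blanking(
        {"BMX/YBMNPX": 20, "REGX": 40}, blanking_t_cycles=100)
    print(order, total, fits)
```

A `fits` value of False indicates that the programmed commands would run into active display time, which the text says the programmer must prevent.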

Transfer Requests 

The following commands request the 82750PB to transfer
information from the VRAM array into the VRAM shift
register. When multiple requests are programmed for a given
line, they are listed in the order of priority in which
they are sent. When asserting a transfer request, the
programmer must be aware of two other programmed
parameters, VBLEN and SCLK delay.

The VBLEN parameter is a user-programmed value whose bits
lie in the General Control Register. It is the length of
time, in 82750DB T-cycles, that a particular VBUS code will
be held at the outputs. It is used to ensure that the
asynchronously operating 82750PB chip will have enough time
to recognize and begin operating on an 82750DB transfer
request.

The other parameter the programmer needs to set is the SCLK
delay. This can be found in the Pixel Control Register. It
is the number of 82750DB clock cycles that the 82750DB will
wait, after the initiation of a transfer request on the
VBUS outputs, before clocking in data out of the VRAM.

REGX (0010) This command requests that the 
82750PB transfer 82750DB register information into 
the VRAM shift registers. Besides the automatic 
82750DB register transfer that occurs on the second 
line (line 2) of each field, the programmer can speci- 
fy the next horizontal line on which another register 
transfer is to take place. The transfers may be 
scheduled many times during the field. On the first 
transfer, the 82750PB uses the contents of its 
82750DBC register as the starting address of the 
82750DB register data. On each subsequent ac- 
cess, the programmed pitch value in 82750PB’s 
82750DBC-PITCH register is added to the accumu- 
lated start address. The programmer must ensure 
that the data is stored in VRAM at the correct ad- 
dress. Since the pitch remains constant, the longest 
register load will determine the pitch value. 
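
The address progression for successive register transfers can be sketched as below. The function names are hypothetical, and the units of the addresses and pitch are deliberately left abstract because the text does not state them.

```python
def regx_start_addresses(start_address, pitch, transfers):
    """Start address used by each successive REGX in a field.

    The first transfer uses the start address; each later transfer
    adds the constant pitch to the accumulated address.
    """
    return [start_address + k * pitch for k in range(transfers)]


def minimum_pitch(load_lengths):
    """The pitch is constant, so it must cover the longest register load.

    load_lengths are expressed in the same (unspecified) units as pitch.
    """
    return max(load_lengths)


if __name__ == "__main__":
    pitch = minimum_pitch([0x20, 0x40, 0x10])
    print([hex(a) for a in regx_start_addresses(0x1000, pitch, 3)])
```

Shorter register loads simply leave the remainder of their pitch-sized slot unused.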

The VBUS unit performs a vertical checksum on all the
register information. Each bit in the register word
undergoes an exclusive-OR with the corresponding bit in the
previous data word. The 82750DB compares this information
with the user-generated checksum, which is the last 32-bit
data word read into the 82750DB during a register transfer.
If the values do not match, the 82750DB will disable all of
its digital sync and data outputs, enter the reset state,
and send a SHUTDOWN code (82750DBSD) to the 82750PB over
the VBUS[3:0] outputs. If the new checksum is correct, the
new register values will take effect immediately.
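
A host program has to generate the checksum word that terminates each register load. The sketch below shows one way to compute a vertical exclusive-OR checksum over 32-bit words. It assumes the running value starts at zero and that the checksum word itself is excluded from the calculation; the text does not state either point, so both are assumptions. All function names are hypothetical.

```python
from functools import reduce
from operator import xor


def vertical_checksum(words):
    """Bitwise XOR of all 32-bit register words (assumed initial value 0)."""
    return reduce(xor, words, 0) & 0xFFFFFFFF


def append_checksum(words):
    """Return the register block with the user-generated checksum last."""
    return list(words) + [vertical_checksum(words)]


def transfer_is_valid(block):
    """Model of the comparison: last word must equal the XOR of the rest."""
    *data, checksum = block
    return vertical_checksum(data) == checksum


if __name__ == "__main__":
    block = append_checksum([0x12345678, 0x0F0F0F0F])
    print([hex(w) for w in block], transfer_is_valid(block))
```

Because XOR is its own inverse, any single corrupted bit in the block changes the result of the comparison.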

VUXFER (0001) This code is used to request VU data,
provided new VU data is required by the 82750DB. This
command is issued only on vertically active lines (as
programmed in the register, not as seen on the screen) and
possibly the four lines after. On each line, a row of V
and/or U samples is loaded into the VU interpolator line
stores. The pattern of requests depends upon the mode in
which the VU interpolator is operating. In the interlaced
VU mode, one line of samples for both the V and U
components is fetched during each transfer; in the
non-interlaced VU mode, only one line of samples for either
the V or U component is fetched. Table 2-1 illustrates the
pattern of requests. M is the programmed first vertical
active line, and N the last active line. The modes listed
have VU transfer requests following the end of horizontal
active of the lines specified, stopping with the last line,
N + 4.
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Table 2-1. VU Transfer Request Patterns 


Mode                 Active Line   Request VU Data
------------------   -----------   --------------------------
2x Non-Interlaced    M             Fetch 1st Line of V
                     M + 1         Fetch 1st Line of U
                     M + 4         Fetch 2nd Line of V
                     M + 5         Fetch 2nd Line of U
                     N + 4         Fetch Last Line of V

2x Interlaced        M             Fetch 1st Line of V and U
(Odd and Even        M + 4         Fetch 2nd Line of V and U
Fields)              M + 5         Fetch 3rd Line of V and U
                     N + 4         Fetch Last Line of V and U

4x Non-Interlaced    M             Fetch 1st Line of V
                     M + 1         Fetch 1st Line of U
                     M + 4         Fetch 2nd Line of V
                     M + 5         Fetch 2nd Line of U
                     M + 8         Fetch 3rd Line of V
                     N + 4         Fetch Last Line of V

4x Interlaced        M             Fetch 1st Line of V and U
(Odd and Even        M + 4         Fetch 2nd Line of V and U
Fields)              M + 6         Fetch 3rd Line of V and U
                     N + 4         Fetch Last Line of V and U


The 82750PB uses another internal pointer to cause the VRAM
to load the desired VU data into its shift registers
(incrementing the pointer by a pitch value). This command
is asserted for a programmable number of T-cycles (m), as
specified in the Miscellaneous Control register. The
82750DB then fetches the samples, tying up the
82750DB/VRAM interface for (n + 2) cycles, where n is the
programmable total number of 8-bit samples of V and U
fetched. Note that one extra word, which may overlap the
next VBUS command, is fetched.

By setting a bit in the Miscellaneous Control register, it
is possible to replicate lines of V and U generated by the
interpolator for the entire field. Since each line of VU
data is displayed twice, the rate at which the VU sample
map has to be fetched from VRAM is reduced by 1/2. Table
2-2 lists the sequence of VU loads.

In some cases, the VU interpolator may cover only a 
portion of the display. In those instances, M in the 
above examples would be the first line that VU inter- 
polation is enabled. N would be the last line that VU 
interpolation is enabled. Regardless of the state of 
the Line Replicate bit, there would be no vertical 
pipeline delay between the loading of the first line of 
samples and the second line of samples. The first 
line of samples would be loaded at M-1, and the 
second line at M. This reduces the delay between 
switching interpolation modes during a single dis- 
play. 


Table 2-2. VU Transfer Request Patterns with Line Replicate

Mode                 Active Line   Request
------------------   -----------   --------------------------
2x Non-Interlaced    M             Fetch 1st Line of V
                     M + 1         Fetch 1st Line of U
                     M + 4         Fetch 2nd Line of V
                     M + 5         Fetch 2nd Line of U
                     M + 8         Fetch 3rd Line of V
                     M + 9         Fetch 3rd Line of U
                     N + 4         Fetch Last Line of V

2x Interlaced        M             Fetch 1st Line of V and U
(Odd and Even        M + 4         Fetch 2nd Line of V and U
Fields)              M + 6         Fetch 3rd Line of V and U
                     N + 4         Fetch Last Line of V and U

4x Non-Interlaced    M             Fetch 1st Line of V
                     M + 1         Fetch 1st Line of U
                     M + 4         Fetch 2nd Line of V
                     M + 5         Fetch 2nd Line of U
                     M + 12        Fetch 3rd Line of V
                     M + 13        Fetch 3rd Line of U
                     N + 4         Fetch Last Line of V

4x Interlaced        M             Fetch 1st Line of V and U
(Odd and Even        M + 4         Fetch 2nd Line of V and U
Fields)              M + 8         Fetch 3rd Line of V and U
                     N + 4         Fetch Last Line of V and U


BMX (0000) This command requests a bitmap. It is sent
after horizontal active stops, beginning on the fifth line
after vertical active starts, and continuing until the
fifth line after vertical active stops. (There is a
vertical pipeline delay of five lines through the 82750DB,
due to internal timing requirements.) A line programmed to
start at line M will have its first active line displayed
at line M + 5. The 82750PB uses an internal pointer to
cause the VRAM shift registers to be loaded with pixel
values. The 82750DB subsequently fetches them as required
for display. This command is asserted on the VBUS for the
user-programmed number of T-cycles and must be completed
before active display begins.

YBMNPX (0100) This command performs a Y bit- 
map transfer without performing a pitch calculation. 
When the line replicate mode is selected by Bit 22 in 
the Miscellaneous Control register, this code is as- 
serted every other display line so that the same line 
of information can be used twice. 
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Digitizer Commands 

When in the line replicate mode and digitizing an NTSC
source (for example, when genlocking an NTSC source to a
system that uses only a VGA monitor), each line of captured
data is effectively output at twice the rate. Since each
line need only be stored once in memory (it is duplicated
automatically in the display mode), only one WRDIGI code,
followed by a WRDIGINP, is sent every other line. On
alternate lines, two WRDIGINP codes are sent; these select
the last address that was written, without incrementing the
82750PB bitmap address pointer. This is described in detail
in Chapter 3.

WRDIGI (0011) This command requests a write of digitized
data. The operation of this command is dependent upon the
external hardware and is discussed in the section on
genlocking (page 29). If digitizing is enabled, this
command is asserted on the VBUS for a programmable number
of T-cycles. The pointer is then incremented by a pitch
value. Since each horizontal line is stored in a single row
of memory, this pitch value is equal to the horizontal
resolution, in bytes, for non-interlaced bitmaps. For
interlaced bitmaps, the pitch value is equal to twice the
horizontal resolution, in bytes. This allows alternate
lines of data to be skipped over in successive fields.
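
The pitch rule for digitized bitmaps can be written out as a short sketch. The function names and the base address are hypothetical; only the factor of one or two times the horizontal resolution comes from the text.

```python
def digitizer_pitch(horizontal_resolution_bytes, interlaced):
    """Pitch added to the bitmap pointer after each WRDIGI."""
    return horizontal_resolution_bytes * (2 if interlaced else 1)


def digitizer_line_addresses(base, horizontal_resolution_bytes,
                             lines, interlaced):
    """Addresses written by successive WRDIGI commands within one field."""
    pitch = digitizer_pitch(horizontal_resolution_bytes, interlaced)
    return [base + i * pitch for i in range(lines)]


if __name__ == "__main__":
    # Interlaced capture: every other row of the bitmap is written per field.
    print(digitizer_line_addresses(0, 640, 3, interlaced=True))
```

For an interlaced bitmap, the second field would start one line (one horizontal resolution) further into the bitmap so that the two fields interleave; that starting offset is an inference and is not modelled above.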

WRDIGINP (0111) This command allows access to digitized
data without performing a pitch calculation. It requests
that the 82750PB perform a transfer request at the last
calculated address. Note that only a memory transfer cycle
is performed; the pitch value is not added to this address.
This ensures that the digitized data is always written into
the last selected memory address, in case a physical memory
boundary has been crossed. This command is asserted after
the WRDIGI transfer has completed.

Refresh and Control Commands 

The following signals are used to pass refresh re- 
quests and control information to the 82750PB. 

DFL (1000) The Display Format Load command is a maskable
host processor interrupt that can be programmed to occur at
any time during the display. It is used by the 82750PB to
transfer the shadow register contents into the working
register set in the VRAM interface. This is useful in
supporting split-screen-type applications, where it is
desirable to change the bitmap pointers at some point
before the end of the display.


82750DBSD (1001) This command is the 82750DB Shut Down
code. During every register transfer, the 82750DB keeps an
internal vertical exclusive-OR checksum of the register
data as it is read onto the chip. The last word of data
that is read during the register transfer is the
user-generated checksum. If the two checksums match,
operation proceeds as normal. If they do not match, the
82750DB enters the reset state and sends this code to the
82750PB. The 82750DB will remain reset until the reset pin
is asserted and negated by the host processor.

REFRESH (1010) This command asks the 
82750PB to generate up to 15 refresh cycles every 
horizontal line. The 82750DB transfer cycles have a 
higher priority than refresh requests in the 82750PB. 
REFRESH will not be asserted if programmed to oc- 
cur at the same time as a transfer request code. 


Video Synchronization Information 

The following codes are used to pass the video line 
and field information from 82750DB to the pixel 
processor. 

VEVEN (1101) This code indicates the start of an even
(i.e. second) field of a frame. This command is sent
coincident with line one of each even field. When
genlocking to an external source (see pg. 29), the
occurrence of a vreset signal during programmed horizontal
active time will cause the 82750DB to output a VEVEN code
on the VBUS.

VODD (1100) This code indicates the start of an odd (i.e.
first or only) field of a frame. This command is always
sent immediately after RESETS# is negated, and coincident
with line one of the odd field. Similarly, when genlocking,
the occurrence of a vreset signal during any time other
than horizontal active time will cause the 82750DB to
output a VODD code on the VBUS.

HLIN (1110) This code marks every horizontal line at a
programmable point in the line. HLIN is used by the 82750PB
to increment its horizontal line counter.
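
For reference, the VBUS codes described in this section can be collected into a single lookup. The sketch below uses a Python enumeration; the class and function names are hypothetical, and the shutdown code is named `DBSD` only because a Python identifier cannot begin with a digit. Code values not described in this section decode to `None` rather than being given an invented meaning.

```python
from enum import IntEnum


class VbusCode(IntEnum):
    """4-bit VBUS[3:0] codes as listed in the text."""
    BMX = 0b0000
    VUXFER = 0b0001
    REGX = 0b0010
    WRDIGI = 0b0011
    YBMNPX = 0b0100
    WRDIGINP = 0b0111
    DFL = 0b1000
    DBSD = 0b1001      # 82750DBSD (shut down) in the text
    REFRESH = 0b1010
    VODD = 0b1100
    VEVEN = 0b1101
    HLIN = 0b1110


# Codes listed under "Transfer Requests".
TRANSFER_REQUESTS = {VbusCode.REGX, VbusCode.VUXFER,
                     VbusCode.BMX, VbusCode.YBMNPX}


def decode_vbus(bits):
    """Map a 4-bit value to a VbusCode, or None if it is not listed."""
    try:
        return VbusCode(bits & 0xF)
    except ValueError:
        return None


if __name__ == "__main__":
    print(decode_vbus(0b1110), decode_vbus(0b0101))
```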
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Pixel Processing Path 

This logic accepts the 32-bit word from the input latch and
divides the word into the programmed pixel format. This
will result in either four 8-bit pixels, two 16-bit pixels,
one 32-bit pixel, or an 8-bit pixel with an 8-bit alpha
value (pseudo 16-bit mode). The pixels act as addresses to
the color table, or may bypass the table completely as
described below.
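
The division of the 32-bit word into pixels can be sketched as follows. The ordering of pixels within the word (lowest-order bits first) is an assumption made for illustration; it matches the "earliest in lowest order" rule given later for VU data but is not stated for pixels. The function names are hypothetical.

```python
def split_word(word, bits_per_pixel):
    """Split a 32-bit word into 8-, 16-, or 32-bit pixels.

    Assumption: the earliest pixel occupies the lowest-order bits.
    """
    if bits_per_pixel not in (8, 16, 32):
        raise ValueError("bits_per_pixel must be 8, 16, or 32")
    mask = (1 << bits_per_pixel) - 1
    return [(word >> shift) & mask
            for shift in range(0, 32, bits_per_pixel)]


def split_word_pseudo16(word):
    """Pseudo 16-bit mode: each 16-bit half holds (pixel, alpha).

    The low 8 bits of each half are the pixel, the high 8 bits the alpha.
    """
    return [(half & 0xFF, half >> 8) for half in split_word(word, 16)]


if __name__ == "__main__":
    print([hex(p) for p in split_word(0x44332211, 8)])
    print(split_word_pseudo16(0xA1B2C3D4))
```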

Pixel information may be mixed with the output of 
the VU interpolator, which outputs interpolated sam- 
ples derived from a reduced sample bitmap. The 
least significant bit of Y or LSB of U can be pro- 
grammed to act as a switch between using the ex- 
plicit pixel value of YUV or using the luminance por- 
tion of the pixel with the VU portion obtained from 
the interpolator. If the value of the LSB of Y (or U, 
whichever is selected) is zero, the pixel data is used. 
If the LSB of Y (or U) is one, the output of the VU 
interpolator is used. Note that if the LSB of Y is used 
as the switch flag, the luminance portion of the word 
will be only 7 bits wide. 

The alpha information is also processed in this block. The
alpha data may come from one of two sources: it may be
explicitly coded in the pixel word, as is the case in the
32-bit/pixel and pseudo 16-bit/pixel modes, or it may be
obtained by comparing the Y portion of the pixel with a
preprogrammed value and outputting one preprogrammed value
if they match and a different value if they do not match.
This latter capability is known as Alpha Trap.


VU Interpolation 

When VU interpolation is enabled by the programmer, and
when the display is in the active region, "VU data" will be
fetched as required by the interpolator (by the mechanisms
discussed previously in the section titled "VBUS Code
Description"). This data has the format V, V, . . . , V, U,
U, . . . , U where each V or U is 8 bits, and the bytes are
grouped into 32-bit double-words with the earliest in the
lowest order. The number, "N", of V bytes and U bytes is
the same; N is programmed to be either 256 samples, or one
of 32 to 192 samples in 32-byte increments.

The first V data and the first U data fetched on the first
line of VU interpolation supply the VU value for the first
active pixel on that line. All the other VU pairs that are
fetched define values for the grid of pixels below and to
the right of this one, spaced by the VU expansion factor
(every other or every fourth pixel, horizontally and
vertically). Most other VU values are filled in recursively
by interpolation. Wherever there is a pixel which lies
between two pixels with known values, it is given the value
of the weighted average of the known values. Values are
understood to be non-negative integers. When the final
value is output, any fractions are truncated or rounded to
the closest odd integer according to the programmed value
of the interpolation round flag. This process is iterated
until all pixels have assigned color values. If the number
of VU data samples loaded into the 82750DB is not enough to
cover the active display area, then the last data sample
will be replicated horizontally across the active display
window.
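
The horizontal part of this process can be illustrated with a one-dimensional sketch. It is a simplified model, not the device's algorithm: it computes the distance-weighted average of the two neighbouring samples directly instead of recursively, implements only the truncation option (the round-to-odd option is not modelled), and ignores vertical interpolation. The function name and arguments are hypothetical.

```python
def expand_line(samples, factor, width):
    """Expand one line of V or U samples by an integer factor (2 or 4).

    Pixels between two known samples get their weighted average,
    truncated to an integer. Past the last sample, the last sample
    is replicated across the rest of the line.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    out = []
    for x in range(width):
        index, step = divmod(x, factor)
        if index >= len(samples) - 1:
            out.append(samples[-1])          # replicate the last sample
        else:
            left, right = samples[index], samples[index + 1]
            weighted = left * (factor - step) + right * step
            out.append(weighted // factor)   # truncate any fraction
    return out


if __name__ == "__main__":
    print(expand_line([0, 8], factor=4, width=6))
```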

As mentioned previously in the VBUS Control dis- 
cussion, each line of VU data can be used twice by 
setting the Line Replicate bit in the Miscellaneous 
Control register. Also, each horizontal VU sample 
can be replicated by setting the VU Replicate bit in 
the Pixel Control register. This will cause the V and 
U pixels generated by the VU interpolator every pixel 
time to be used twice. This can result in an effective 
8X horizontal expansion, which is useful when hori- 
zontal blanking time is at a premium. This bit affects 
the horizontal interpolation algorithm only, and will 
not affect the line loading sequence for VU during 
the active display. 

When interpolation is turned on by the programmer (by
specifying a non-zero number of samples to be fetched), VU
interpolation may nevertheless be disabled for each pixel
if the following conditions are met:

1. Conditional interpolation has been selected by the
programmer,

AND

Either of the two user-programmed conditions:

a. Switching on the LSB of U has been selected, and the
lowest-order bit of the U value fetched for the upper
left pixel in the block has a value of zero. This allows
switching to occur on a 2 x 2-pixel or 4 x 4-pixel grid,
depending on the expansion mode the user has selected.
The full 8 bits of Y and V are used, but the usable
space of U has been decreased to 7 bits.

b. Switching on the LSB of Y has been selected, and the
low-order bit of the Y value for the current pixel has a
value of zero.

2. Display of fetched and interpolated VU values may also
be suppressed by setting the Interpolation Output Enable
bit (in the Miscellaneous Control register) to zero. This
will allow VU data to be loaded into the VU line stores
without displaying VU data. This is useful when a
mid-screen transition is made between two interpolation
modes, to compensate for the vertical latency of the
interpolation process.
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Colormap Lookup Table (CLUT) Operation

The 82750DB contains three 256 x 8-bit color lookup tables.
The color maps can be accessed separately, or may act as
one large 256 x 24-bit table. The manner in which the
tables are addressed is determined by the programmed
bits/pixel and depends on whether the pixel is a graphics
or video pixel. Also, each Y, U, and V color table address
can be masked. The masks can be used in all the bit/pixel
modes, but are most useful with the 16-bit/pixel mode. In
this mode, the mask allows the YUV values to be mapped to
8-bit values instead of 6-5-5.

Each channel (Y, U, V) has a MASK SET register and a MASK
DATA register that select the color lookup address bit to
be changed and the new value of the bit, respectively. A
simple mask operation on one channel is illustrated in
Figure 2-8.

The CLUT address mask operation is determined by the
logical equation:

Result = (mask set and mask data) | (not mask set and data byte)
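
The mask operation can be sketched in a few lines. The sketch follows the register description above: bits selected by MASK SET take their value from MASK DATA, and all other bits pass through from the data byte unchanged. The pass-through of unselected bits is inferred from that description, and the function name is hypothetical.

```python
def mask_address(data_byte, mask_set, mask_data):
    """Apply the color lookup address mask to one 8-bit channel value.

    Bits where mask_set is 1 are replaced by the matching mask_data bit;
    bits where mask_set is 0 keep the value from data_byte (assumption).
    """
    return ((mask_set & mask_data) | (~mask_set & data_byte)) & 0xFF


if __name__ == "__main__":
    # Replace the three low-order address bits with 101.
    print(bin(mask_address(0b10100000, 0b00000111, 0b00000101)))
```

Each bit of the result is determined independently, so the same expression serves the Y, U, and V channels with their own register pairs.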


For modes that require both video and graphics to pass
through the color table, the table can be split into two
halves: one half for graphics and the other for video
pixels. By using the SPLITCLUT bit in the Miscellaneous
Control register in conjunction with the LSB of Y or U, the
color table address is forced to either the video table or
graphics table automatically. In this case, the masking
operation is still used, but the address is forced to
either an even or odd entry, regardless of the results of
the masking operation. The flag bit that decides between
the two types of pixels automatically selects the correct
portion of the CLUT for a single channel. Note that the LSB
of Y or U selects the proper half of the CLUT for that
single component only. The SPLITCLUT mode ensures that the
proper half of the CLUT is used for all three components.

The color table can be bypassed completely when displaying
either graphics or video, independent of the programmed
bits/pixel. This is programmed by the user via the VIDEO
PASS and GRAPHICS PASS bits in the Miscellaneous Control
register. Table 2-3 summarizes the various modes when using
the CLUT.


Each bit of the Result byte is determined individually by
this equation. The Result byte is then further processed in
order to produce the CLUT RAM address.
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Figure 2-8. Mask Operation on CLUT Address


Table 2-3. CLUT Modes

Graphics   Video   LSB
Pass       Pass    Y or U   SPLITCLUT   Colormap Address
--------   -----   ------   ---------   ----------------------------
0          X       0        0           Masked Graphics Data
1          X       0        X           Graphics Pixels Bypass CLUT
X          0       1        0           Masked Video Data
X          1       1        X           Video Pixels Bypass CLUT
0          X       0        1           Even Address Only (Graphics)
X          0       1        1           Odd Address Only (Video)
1          1       X        X           CLUT Not Used at All
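
The rows of Table 2-3 can be restated as a small decision function, which may be easier to check against a programmed configuration than the table itself. The function name and the returned strings are illustrative only.

```python
def clut_address_mode(graphics_pass, video_pass, lsb, splitclut):
    """Return the Table 2-3 colormap addressing mode for one pixel.

    lsb is the LSB of Y or U used as the switch flag:
    0 selects a graphics pixel, 1 selects a video pixel.
    """
    if graphics_pass and video_pass:
        return "CLUT not used at all"
    if lsb:  # video pixel
        if video_pass:
            return "video pixels bypass CLUT"
        return "odd address only (video)" if splitclut else "masked video data"
    if graphics_pass:
        return "graphics pixels bypass CLUT"
    return ("even address only (graphics)" if splitclut
            else "masked graphics data")


if __name__ == "__main__":
    print(clut_address_mode(0, 0, 1, 1))
```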




When writing to the CLUT, the most significant byte of the
data word corresponds to the address, and the least
significant 24 bits are the YUV data (least significant to
most significant, respectively). An index register is used
to allow the 6-bit address to be mapped to an 8-bit number.
(Refer to Chapter 4 for more information.) By resetting the
82750DA Disable bit, it is possible to make the CLUT look
like the reduced entry color lookup table on the 82750DA.
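
The layout of a color table write word can be sketched as below. It reads "least significant to most significant, respectively" as Y in bits 7:0, U in bits 15:8, and V in bits 23:16, with the address in bits 31:24. The function name is hypothetical, and the index register mapping mentioned above is not modelled.

```python
def clut_write_word(address, y, u, v):
    """Pack one color table entry into a 32-bit register data word.

    Bits 31:24 = table address, 23:16 = V, 15:8 = U, 7:0 = Y.
    """
    for name, value in (("address", address), ("y", y), ("u", u), ("v", v)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    return (address << 24) | (v << 16) | (u << 8) | y


if __name__ == "__main__":
    print(hex(clut_write_word(0x12, y=0x34, u=0x56, v=0x78)))
```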

The following paragraphs summarize the possible bit/pixel
modes, using the LSB of Y or U switching ability and the
various graphics and video bypass modes. Note that there
are modes where the LSB of Y or U is not used to switch
between graphics and video.


8-BIT/PIXEL GRAPHICS MODE 

This is the graphics-only mode, in which the 8 bits are
used as inputs to all three color tables. This makes the
color maps look like a single 256 x 24-bit CLUT and allows
256 unique colors from a palette of 16 million to be
available at any given time. If the Graphics Pass bit is
asserted, the CLUT will be bypassed and the 8-bit values of
the Y, U, and V channels will be input to each channel of
the converter matrix.


8-BIT/PIXEL VIDEO MODE 

When used with subsampled VU information from the
interpolator, the 8 bits are actually a luminance value.
The Y portion addresses the Y color table, V the V color
table, and U the U color table. By using the color table, a
one-to-one mapping exists, allowing non-linear
transformations to be applied to the pixel data to enhance
the quality of the reconstructed image. By asserting the
VIDEOPASS bit in the Miscellaneous Control register, the
color table can be bypassed.


8-BIT/PIXEL MIXED MODE 

In the 8-bit/pixel mixed mode the LSB of Y or U is used as
a switch flag to change the index to the color tables. When
the switch flag is set to a one, the Y value corresponds to
a luminance value, and the VU values are the chrominance
information obtained from the VU interpolator. In this case
each video component is used as an address to its
corresponding CLUT as described above. When the switch flag
is set to a zero, the VU values are not used and the Y
value is used as the address to all color tables. These
pixels are treated the same as in the 8-bit/pixel graphics
mode.

In this mode the applications programmer must en- 
sure that the proper information has been loaded 
into specific areas of the color maps. For example, 
all the video pixels will use the odd address values. 
By restricting the address used in the graphics and 
video mode, two unique maps may coexist in the 
tables. One map is used for non-linear transforma- 
tions on video data, and the other for graphics color 
lookup table applications. 

As illustrated above, the CLUT can be bypassed by
asserting either or both of the bypass controls.

PSEUDO 16-BIT/PIXEL GRAPHICS MODE 

In the pseudo 16-bit/pixel graphics mode each 32-bit data
word is made up of two 16-bit pixel words. The 82750DB
processes each 16-bit pixel word so that the least
significant 8 bits correspond to pixel information, and the
most significant 8 bits are used as alpha information. The
82750DB uses the lower 8 bits as inputs to all three color
tables. This makes the color maps look like a single 256 x
24-bit color table. If the Graphics Pass bit is asserted,
the CLUT will be bypassed and the 8-bit values of the Y, U,
and V channels will be input to each channel of the
converter matrix.


PSEUDO 16-BIT/PIXEL VIDEO MODE 

When used with subsampled VU information, the least
significant 8 bits of the pixel word are actually a
luminance value. The most significant 8 bits are used as
alpha information. The VU information is generated by the
82750DB interpolator. Each of the color maps uses the
corresponding 8-bit video component as an address. By
asserting the Video Pass bit in the Miscellaneous Control
register, the color table can be bypassed.
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PSEUDO 16-BIT/PIXEL MIXED MODE

In this mode the LSB of Y or U is used as a switch flag to
change the index to the color tables. When the LSB of Y or
U is set to a one, the lower 8-bit value corresponds to a
luminance value, and the V and U values are the chrominance
information. In this case, each video component of the
82750DB is used as a colormap address as described above.
When the LSB of Y or U is set to zero, the V and U values
from the interpolator are not used, and the Y value is used
as the address to all color tables.


16-BIT/PIXEL GRAPHICS MODE 

The 16-bit pixel word is broken up on the 82750DB to yield
6 bits of Y, and 5 bits each of V and U. The Y bits are the
least significant, and the U bits are the most significant.
These values are then padded with zeros in the lower order
bits, to obtain an 8-bit word for each pixel component.
Each component addresses its respective CLUT. However, the
Y channel may access only 64 unique locations, and the
5-bit resolution for VU restricts them to 32 unique
locations each. The address range may be extended by using
the colormap mask registers to add 2 bits of precision in
the least significant bits for Y and 3 least significant
bits each for the VU channels. This allows the programmer
to access all the entries in the color table by
reprogramming the MASK DATA and MASK SET registers during
the blanking interval.
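
The 6-5-5 unpacking and zero padding can be sketched as follows. The bit positions follow the text: Y in the low 6 bits, U in the high 5 bits, and V (by elimination) in the 5 bits between them. The helper that fills the padded low-order bits stands in for the mask registers and is a simplification. Both function names are hypothetical.

```python
def unpack_655(pixel):
    """Split a 16-bit 6-5-5 pixel into zero-padded 8-bit (y, u, v).

    Bits 5:0 = Y, bits 10:6 = V, bits 15:11 = U.
    """
    y = (pixel & 0x3F) << 2           # 6 bits, padded with 2 zeros
    v = ((pixel >> 6) & 0x1F) << 3    # 5 bits, padded with 3 zeros
    u = ((pixel >> 11) & 0x1F) << 3   # 5 bits, padded with 3 zeros
    return y, u, v


def fill_low_bits(component, pad_bits, fill):
    """Replace the padded low-order bits, as the mask registers allow."""
    low_mask = (1 << pad_bits) - 1
    return (component & ~low_mask & 0xFF) | (fill & low_mask)


if __name__ == "__main__":
    pixel = (0b10101 << 11) | (0b01010 << 6) | 0b111111
    y, u, v = unpack_655(pixel)
    print(hex(y), hex(u), hex(v), hex(fill_low_bits(y, 2, 0b11)))
```

With the padding left at zero, Y can reach only every fourth table entry and U or V only every eighth, which is why the mask registers are needed to reach the remaining entries.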

16-BIT/PIXEL VIDEO MODE 

This mode works like the 8-bit/pixel video mode described
above, except that the 82750DB has processed the
information so that the Y channel contains the least
significant 8 bits of the 16-bit data word. The V and U
information is generated by the VU interpolator. If the
SPLITCLUT mode is selected, the LSB of the address is
forced to an odd entry in the three color tables.

16-BIT/PIXEL MIXED MODE 

When the switch flag is zero, the graphics mode is selected
and the inputs to the CLUT are the respective YUV data in
the 6-5-5 format. These pixel values are extended by using
the colormap masking registers. When the switch flag
indicates the video mode, the lower 8 bits of the 16-bit
pixel word and the VU values obtained from the interpolator
are input to their respective CLUTs. If the SPLITCLUT mode
is selected, the LSB of the address is forced to either an
odd or even entry in the three color tables, depending on
whether the data is video or graphics information.

32-BIT/PIXEL GRAPHICS MODE 

Eight bits each of Y, U, and V are used as addresses to
each segment of the color table. Since the size of the
addressable color space is not increased, the advantage of
using the color map is for special effects or gamma
correction. The most significant 8 bits of the 32-bit data
word are used for the alpha channel data. If the Graphics
Pass bit is asserted, the CLUT will be bypassed and the
8-bit values of Y, V, and U will be input to each channel
of the converter matrix.


32-BIT/PIXEL VIDEO MODE 

The Y channel contains the least significant 8 bits of the
32-bit data word. The U and V information is generated by
the VU interpolator. The YUV channels are input to their
respective color tables. The size of the addressable color
space is not increased, but this can be used to take
advantage of a non-linear transformation, which may aid in
the decompression process. The most significant 8 bits of
the data word are used for the alpha channel data.

32-BIT/PIXEL MIXED MODE 

When the switch flag is zero, the graphics mode is
selected, and the inputs to the CLUT are the respective 8
bits each of YUV data. These pixel values may be masked by
using the colormap mask data and mask set registers. When
the switch flag indicates the video mode, the lower 8 bits
of the pixel word and the VU values obtained from the
interpolator are input to their respective CLUTs. If the
SPLITCLUT mode is selected, the LSB of the address is set
to either an odd or even entry in the three color tables,
depending on whether the data is video or graphics
information. The most significant 8 bits of the data word
are used for the alpha channel data.
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Y Interpolator 

The Y Interpolator performs a 2X horizontal linear
interpolation on each line of Y values. When Y
interpolation is enabled, the internal pixel clock is twice
the frequency of the PIXCLK output.

NOTE: 


If Y interpolation is enabled, then only the integer 
values of pixel times greater than 1X may be 
used. 


The interpolation may be separately controlled for 
both video and graphics pixels, via the Viden and 
Gren bits (bits 12 and 11) of the General Control 
register. A video pixel is defined as one generated 
using VU interpolated values. A graphics pixel does 
not use the VU interpolator. The effects of setting 
the control bits, the 82750DB enable flag, and vid- 
eo/graphics pixel switch (V/G Switch) on the output 
of the interpolator are summarized in Table 2-4. 

Because of the asymmetric nature of the internal pixel
clock used on the 82750DB, the number of T-cycles between
successive Y pixels varies depending on the programmed
pixel width. When enabled, there is a pipeline delay
through the Y Interpolator equal to the number of T-cycles
between each internal pixel clock.

When the interpolator is bypassed as described 
above, there is a fixed delay through this block. The 
V and U data are delayed by one pixel clock to allow 
the chroma data to line up with the luminance data. 
Other control signals, such as the register address 
byte (most significant byte of the 32-bit data word 
read from VRAM), the pixel clock, horizontal and 
vertical active displays, composite blanking, and reg- 
ister load enable signals are also delayed by one 
pixel clock in order to line up with the YUV data. The 
programmer must ensure that the active display tim- 
ing is programmed to take the appropriate delay 
through the Y Interpolator into account. 


Table 2-4. Control Bit Settings and Resulting Interpolator Output

82750DB                   V/G
Enable    Viden   Gren    Switch   Result
-------   -----   ----    ------   ------------------------------------------
0         X       X       X        Interpolator Bypassed
1         0       0       X        Interpolator Bypassed
1         0       1       0        Interpolate Graphics Pixel
1         0       1       1        Do Not Interpolate Video Pixel
1         1       0       1        Interpolate Video Pixel
1         1       0       0        Do Not Interpolate Graphics Pixel
1         1       1       X        Interpolate Both Video and Graphics Pixels
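
Table 2-4 reduces to a simple rule: with the 82750DB enable flag set, a video pixel is interpolated when Viden is set and a graphics pixel when Gren is set. The sketch below encodes that reading. It takes a V/G Switch value of 1 to mean a video pixel, as the table rows imply, and the function name is hypothetical.

```python
def y_pixel_is_interpolated(enable, viden, gren, vg_switch):
    """Return True if the Y Interpolator acts on the current pixel.

    vg_switch: 1 for a video pixel, 0 for a graphics pixel.
    """
    if not enable:
        return False            # interpolator bypassed
    return bool(viden) if vg_switch else bool(gren)


if __name__ == "__main__":
    for row in [(0, 1, 1, 1), (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1)]:
        print(row, y_pixel_is_interpolated(*row))
```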


Cursor 

Hardware support for a 16 x 16-pixel cursor has
been included on the 82750DB. The cursor is capable
of providing sharp color transitions when using
subsampled VU bitmaps. Software intervention is
minimized, leaving the host with more processing
cycles to perform other operations.

Under normal operation, the XY starting display
position of the cursor is loaded into the Cursor Control
register during an 82750DB register load. On the
display line corresponding to the Y start position, the
cursor is displayed when the X starting position
(specified in T-cycles) is reached. On the following
15 lines, the cursor will be displayed at this X
position every line, for both interlaced and
non-interlaced displays.

A normal 82750DB register transfer is used to load 
the entire 16x16x2 bits (16 words of 32 bits each) 
of cursor data. During this register transfer, the cur- 
sor data is distinguished from normal register data 
by placing the Cursor Control register immediately 
before the 16 words of cursor data. When the 
82750DB loads the Cursor Control register, it will in- 
terpret the next sixteen 32-bit words of register data 
as the cursor bitmap, and will disable the other regis- 
ters on the 82750DB from decoding the address 
field of the 32-bit data word. (The checksum of the 
82750DB register data is not performed during the 
loading of the cursor bitmap data.) The cursor bitmap
will be loaded a line at a time, starting at line
zero and continuing in sequential order to line 15.
Each line in the cursor map actually contains sixteen 
2-bit cursor pixels, with the two least significant bits 
corresponding to the first cursor pixel in that line, 
and the two most significant bits corresponding to 
the 16th cursor pixel on that line. Each 2-bit pixel 
may select one of the three Cursor Color registers or 
transparency, according to the format indicated in 
Table 2-5. 


Table 2-5. Cursor Color Registers

Cursor Pixel   Output
------------   -----------------------------------------
     00        Transparency (Cursor Pixel Not Displayed)
     01        Cursor Color Register 1
     10        Cursor Color Register 2
     11        Cursor Color Register 3
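As an illustration of the bit ordering described above, the following sketch packs one cursor line into a 32-bit word, with the first cursor pixel in the two least significant bits. The function name is hypothetical; only the bit layout and the 2-bit codes of Table 2-5 come from the text.

```python
def pack_cursor_line(pixels):
    """Pack sixteen 2-bit cursor pixels into one 32-bit cursor word.

    pixels[0] is the first cursor pixel on the line and lands in
    bits 1:0; pixels[15] is the 16th pixel and lands in bits 31:30.
    Each value is a Table 2-5 code (0 = transparent, 1..3 = color register).
    """
    if len(pixels) != 16:
        raise ValueError("a cursor line has exactly 16 pixels")
    word = 0
    for index, pixel in enumerate(pixels):
        if not 0 <= pixel <= 3:
            raise ValueError("cursor pixels are 2-bit values (0..3)")
        word |= pixel << (2 * index)
    return word
```

Sixteen such words, placed immediately after the Cursor Control register word in a register load, would make up the full cursor bitmap.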


Three 24-bit color registers that hold the color infor- 
mation for the cursor may be written to at any time 
during the register load. The cursor may be loaded 
any time during the blanking intervals of the display. 
For displays that do not program the cursor during 
the display, the cursor bitmap may be loaded during 
the vertical blanking interval. 

When the T-cycle count equals the value pro- 
grammed into the X start position of the Cursor Con- 
trol register, the first cursor pixel can be displayed. 


Each 2-bit cursor pixel will select one of the three
Cursor Color registers or transparency. The 24-bit
output of one of the three color registers (or the
actual display pixel data if transparency is used) is
input to the YUV converter.

The cursor bitmap length is 16 lines, and the width is
16 pixels. Although the length of the cursor may be
changed dynamically by chaining register loads to
update the cursor map, the size of the cursor is
dependent on the type of display. For interlaced
displays, each line of cursor data will appear on the
same line of each field. This results in a cursor of
16 x 32 pixels. For non-interlaced displays, the same
line of cursor information will appear on the same
line every field. The cursor in this case will be 16 x
16 pixels. The size of the cursor may be doubled
independently in the horizontal and/or vertical
direction by setting the 2X Horizontal Cursor or 2X
Vertical Cursor bit in the General Control register. In this
case, no new data is loaded into the cursor map; the
data is just replicated in the corresponding
dimension. Table 2-6 summarizes some of the possible
cursor sizes. Note that by loading the cursor bitmap
with different data at the start of every field, cursor
sizes not listed below may be achieved.


Table 2-6. Cursor Sizes

2X Horz.   2X Vert.                     Cursor Size
Cursor     Cursor     Display           (in Pixels)
--------   --------   --------------    -----------
Off        Off        Interlaced        16x32
On         Off        Interlaced        32x32
Off        On         Interlaced        16x64
On         On         Interlaced        32x64
Off        Off        Non-Interlaced    16x16
On         Off        Non-Interlaced    32x16
Off        On         Non-Interlaced    16x32
On         On         Non-Interlaced    32x32
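The entries of Table 2-6 follow from three doubling factors, which the short sketch below reproduces. The function and parameter names are hypothetical and the code only models the listed sizes; it does not cover sizes obtained by reloading the cursor bitmap every field.

```python
def cursor_size(h2x: bool, v2x: bool, interlaced: bool):
    """Return (width, height) in pixels for the settings of Table 2-6."""
    # The 2X Horizontal Cursor bit replicates data horizontally.
    width = 16 * (2 if h2x else 1)
    # Interlaced displays show each cursor line once per field, which
    # doubles the height; the 2X Vertical Cursor bit doubles it again.
    height = 16 * (2 if interlaced else 1) * (2 if v2x else 1)
    return width, height
```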


There is a complex relationship between the cursor
and the pixel data, especially when using non-integral
divisors of the pixel clocks. Since the pixel data
output from the 82750DB pixel path always changes
coincident with the rising edge of the clock, the
cursor start position must be positioned on the rising
edge of any period of the pixel clock. The programmer
must enforce the corresponding restrictions on
the start and stop position of the cursor.

YUV to RGB Converter

The following equations give the theoretical
relationship between analog RGB components, R, G, B, and
analog YUV components, Y, U, V.

    Y = 0.298822 R + 0.586816 G + 0.114363 B              (1a)
    V = R - Y = 0.701178 R - 0.586816 G - 0.114363 B      (1b)
    U = B - Y = -0.298822 R - 0.586816 G + 0.885637 B     (1c)

    where:   0.0 <= G, R, B <= 1.0
             0.0 <= Y <= 1.0
          -0.701 <= V <= +0.701
          -0.886 <= U <= +0.886

Solving for G, R, B, we can obtain the inverse
relationship:

    G = Y - 0.509228 V - 0.194888 U                       (2a)
    R = Y + V                                             (2b)
    B = Y + U                                             (2c)

    where:   0.0 <= G, R, B <= 1.0
             0.0 <= Y <= 1.0
          -0.701 <= V <= +0.701
          -0.886 <= U <= +0.886

The luminance channel for the YUV inputs is presumed
to swing between 0.0V and 1.0V. However,
the chroma components do not and need to be
normalized to a 0V to 1V range. The offset binary
encoding used to obtain unsigned numbers must also
be accounted for. This encoding should center the V
and U inputs at the midpoint of the voltage range.
The equations for the normalized version of Y, V,
and U (Y', V', and U' respectively) are:

    Y' = Y                                                (3a)
    V' = 0.5 V / 0.701 + 0.5                              (3b)
    U' = 0.5 U / 0.886 + 0.5                              (3c)

    where:   0.0 <= Y', V', U' <= 1.0
             0.0 <= Y <= 1.0
          -0.701 <= V <= +0.701
          -0.886 <= U <= +0.886

When converting the normalized analog values Y',
V', U' to digital y, v, u values, the D.C. offset and
conversion ranges are compatible with the CCIR
601 standard for digital video. The ranges for the
components and the corresponding Digital to Analog
equivalent equations are given below:

    y = (235 - 16) Y' + 16      where: 16 <= y <= 235     (4a)
    v = (240 - 16) V' + 16      where: 16 <= v <= 240     (4b)
    u = (240 - 16) U' + 16      where: 16 <= u <= 240     (4c)

Substituting the normalized analog voltages of
Equation 3 into Equation 4, we obtain the digital
version of the input data, used in the DVI(TM) Technology
system:

    y = 219 Y + 16                                        (5a)
    v = 112 V / 0.701 + 128                               (5b)
    u = 112 U / 0.886 + 128                               (5c)

    where:   0.0 <= Y <= 1.0
          -0.886 <= U <= +0.886
          -0.701 <= V <= +0.701
              16 <= y <= 235
              16 <= v, u <= 240

By solving Equations 5 for Y, U, V, and substituting
into Equation 2, we get the relationship between
analog R, G, B and the digital DVI y, u, v data:

    G = 0.004566 y - 0.003187 v - 0.001541 u + 0.532242   (6a)
    R = 0.004566 y + 0.006259 v - 0.874202                (6b)
    B = 0.004566 y + 0.007911 u - 1.085631                (6c)

    where:   0.0 <= R, G, B <= 1.0
              16 <= y <= 235
              16 <= v, u <= 240

If the inputs of the Digital to Analog Converter are
scaled to accommodate the nominal input range of 0
to 219, we obtain the following relationship between
the inputs to the DVI Technology system (y, v, u)
and inputs to the Digital to Analog Converters (r, g,
b). Note that all out of range RGB values (> 255 or
< 0 due to excursions in the inputs) are clipped to
255 or 0.

    g = y - 0.698001 v - 0.337633 u + 116.56116           (7a)
    r = y + 1.370705 v - 191.45029                        (7b)
    b = y + 1.732446 u - 237.75314                        (7c)

    where:    16 <= y <= 235
              16 <= v, u <= 240
               0 <= g, r, b <= 255

By substitution of Equation 5 into Equation 1, and by
converting G, R, and B to digital values, we can
obtain the inverse relationship of Equation 7:

    y = +0.298822 r + 0.586816 g + 0.114363 b + 16        (8a)
    u = -0.172486 r - 0.338721 g + 0.511206 b + 128       (8b)
    v = +0.511545 r - 0.428112 g - 0.083434 b + 128       (8c)

    where:    16 <= y <= 235
              16 <= v, u <= 240
               0 <= g, r, b <= 255
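As a numerical check of Equation 7, the sketch below converts one digital (y, v, u) triple to the (r, g, b) DAC inputs and applies the clipping described above. The function names are hypothetical, and rounding to the nearest integer is an assumption; the text specifies only the coefficients and the clipping to 0 and 255.

```python
def _clip8(value: float) -> int:
    """Round to the nearest integer (an assumption) and clip to 0..255."""
    return max(0, min(255, round(value)))


def yvu_to_rgb(y: int, v: int, u: int):
    """Apply Equations 7a-7c and return the clipped (r, g, b) DAC inputs."""
    g = y - 0.698001 * v - 0.337633 * u + 116.56116   # (7a)
    r = y + 1.370705 * v - 191.45029                  # (7b)
    b = y + 1.732446 * u - 237.75314                  # (7c)
    return _clip8(r), _clip8(g), _clip8(b)
```

With neutral chroma (v = u = 128), black level y = 16 maps to (0, 0, 0) and peak white y = 235 maps to (219, 219, 219), matching the nominal 0 to 219 DAC input range.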

Output Equalization 

The units on the 82750DB process the pixel information
at the operating frequency of the chip. If the
output pixel rate is not equal to the maximum
frequency, the units have null states during which
processing is suspended. This type of operation is
necessary on the 82750DB because of the large
amount of pipelining. Table 2-7 gives the pattern of
T-cycles on the 82750DB during which processing is
active, according to the programming shown in
Table 4-2.

The pixel information must be output at a rate that is
some sub-multiple of the operating frequency. The
divisor is programmed by the user, and may be from
1 to 12 times the period of FREQIN, in
increments of 1/2. Divisors of 13 and 14 are also
programmable. Because non-integral divisors are used,
it is necessary for the 82750DB to output different
information on both phases of FREQIN. This is
illustrated in Figure 2-9, which uses a 2.5 divisor for the
clock. Notice that the pixel clock output (PIXCLK)
transitions fall alternately on the active and inactive
phase of the input frequency, while the internal pixel
clock transitions always occur on the active phase.
Also note that PIXCLK does not have a 50% duty
cycle.

The equalizing logic derives a clock that has a peri- 
od equal to the programmed pixel rate, providing an 
edge to sample the output information. This allows 
the Digital to Analog Converter to directly sample 
the output of the pixel data path before performing 
the analog conversion. 


Table 2-7. 82750DB Active T-Cycle Patterns

Pixel Time    Pattern Of Internal
(T-Cycles)    Pixel Clock
----------    ---------------------------
    1         Always On
    1.5       1 On/1 On/1 Off
    2         1 On/1 Off
    2.5       1 On/1 Off/1 On/2 Off
    3         1 On/2 Off
    3.5       1 On/2 Off/1 On/3 Off
    4         1 On/3 Off
    4.5       1 On/3 Off/1 On/4 Off
    5         1 On/4 Off
    5.5       1 On/4 Off/1 On/5 Off
    6         1 On/5 Off
    6.5       1 On/5 Off/1 On/6 Off
    7         1 On/6 Off
    7.5       1 On/6 Off/1 On/7 Off
    8         1 On/7 Off
    8.5       1 On/7 Off/1 On/8 Off
    9         1 On/8 Off
    9.5       1 On/8 Off/1 On/9 Off
   10         1 On/9 Off
   10.5       1 On/9 Off/1 On/10 Off
   11         1 On/10 Off
   11.5       1 On/10 Off/1 On/11 Off
   12         1 On/11 Off
   13         1 On/12 Off
   14         1 On/13 Off
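The patterns in Table 2-7 follow a regular rule: an integral pixel time N gives one active T-cycle followed by N - 1 idle ones, and a half-integral pixel time alternates a short and a long pixel. The sketch below generates one repeating period under that reading. The function name is hypothetical and the rule is inferred from the table entries rather than stated in the text.

```python
def active_pattern(pixel_time: float):
    """Return one repeating period of the internal pixel clock.

    Each list element is one T-cycle: True = active (On), False = Off.
    pixel_time must be a multiple of 0.5 and at least 1.
    """
    doubled = round(pixel_time * 2)
    if doubled != pixel_time * 2 or doubled < 2:
        raise ValueError("pixel time must be a multiple of 0.5, >= 1")
    whole = doubled // 2
    short_pixel = [True] + [False] * (whole - 1)
    if doubled % 2 == 0:
        # Integral pixel time: every pixel has the same length.
        return short_pixel
    # Half-integral pixel time: a pixel of `whole` T-cycles is followed
    # by one of `whole + 1` T-cycles.
    return short_pixel + [True] + [False] * whole
```

For a pixel time of 2.5 the period is five T-cycles long and contains two active cycles, which averages to the programmed 2.5 T-cycles per pixel.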
Digital to Analog Converters

The Digital to Analog Converters (DACs) take three
channels of video information output from the pixel
data path, converting it from 8-bit digital values to
analog voltage levels typically between 0V and 1V.
The conversion is monotonic, and a pixel clock is
used to derive a two-phase clock internal to the
DAC. The data is sampled from the output of either
the pixel path or the YUV to RGB matrix on the
rising edge of the internal active phase of this clock.
The DISDAC input pin can be asserted to disable the
analog outputs and place them into a high-impedance
state.

The analog outputs of the triple DAC are referenced
to an external current source, which must be
connected to the IREFIN pin. All the analog outputs are
scaled by this current reference. The value of the
analog output full scale is as follows:


where: Iref is the magnitude of the reference
current.

The output voltage generated at full scale is:

    Vfs = Ifs * Rext

where: Rext is the load resistance value.

A typical output load for the analog outputs (RV, BU,
GY) is 75Ω. The speed of the DAC analog output
rise and fall times is determined by the time
constant:

    Rext * (Cext + Cout)

where: Cext is the external capacitance applied and
Cout is the intrinsic capacitance of an analog
output.

For high performance the objective would be to
minimize Rext and Cext. The voltage Voutfs can be
determined by any combination of Ifs and Rext, but
must not exceed 1.5V. In addition, Ifs must not
exceed 22 mA. The analog outputs must go through
an external buffer to drive a doubly-terminated 75Ω
coax line.

Table 2-8 lists pins which are used to configure the
triple DAC.


Table 2-8. Digital To Analog Converter Pins

Signal        Description
----------    ------------------------------------------------------
IREFIN        Analog Current Reference. Must Be Decoupled to AVCC.
VGCS          Internal Voltage Reference. Must Be Decoupled to AVCC.
AVCC          Analog Power
AVSS          Analog Ground
GY, RV, BU    Analog Pixel Outputs
DISDIG        Disable Digital Outputs
DISDAC        Disable Analog Outputs


NOTE:
The digital video outputs must be disabled by
setting DISDIG high whenever the analog outputs
are used. Otherwise the A.C. and D.C.
characteristics of the DAC are not guaranteed.
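The full-scale voltage, the time constant, and the two stated limits (1.5V and 22 mA) can be checked numerically before choosing external components. The sketch below is only an arithmetic aid: the function name is hypothetical, and the component values in the usage note are arbitrary examples, not recommended values.

```python
def dac_full_scale(ifs_amps: float, rext_ohms: float,
                   cext_farads: float = 0.0, cout_farads: float = 0.0):
    """Return (Vfs, time constant, within_limits) for one analog output.

    Vfs = Ifs * Rext and the time constant is Rext * (Cext + Cout).
    within_limits is True when Vfs <= 1.5 V and Ifs <= 22 mA.
    """
    vfs = ifs_amps * rext_ohms
    tau = rext_ohms * (cext_farads + cout_farads)
    within_limits = vfs <= 1.5 and ifs_amps <= 0.022
    return vfs, tau, within_limits
```

For example, an assumed 10 mA full-scale current into 75Ω gives 0.75V, and with an assumed 15 pF of total capacitance the time constant is about 1.1 ns.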
3.0 HARDWARE INTERFACE


82750DB Reset Operation 

Upon power-up, the 82750DB is in an indeterminate
state and must be reset. The RESETS# signal
asserted by the host processor is sampled on the
rising edge of FREQIN. The 82750DB will enter the
reset state a maximum of four cycles after
RESETS# is sampled. The 82750DB will request
the 82750PB to generate VRAM refresh cycles by
asserting a REFRESH code on the VBUS for 16
T-cycles. This code is repeated every 256 T-cycles,
until RESETS# is negated.

NOTE:
The RESETS# input is an edge-triggered input.
After power-up, the host processor must set the
RESETS# input low for a minimum of ten T-cycles
in order to reset the 82750DB. The host
must then set the RESETS# input high to start
normal operation.

When the RESETS# input is released, a Start of
Vertical Field command (VODD) is sent for 16
T-cycles to the 82750PB via the VBUS. This code is
immediately followed by a Register Transfer Request
command (REGX) that is held for 256 T-cycles. This
256 T-cycle wait assures that the 82750PB has ample
time to honor the 82750DB register transfer
request. The register data is then read into the
82750DB from the serial port of the VRAMs at a rate
that is equal to 1/3 of the operating frequency. If the
register transfer does not terminate after 256
T-cycles, the 82750DB will automatically stop the
transfer, send an 82750DBSD code to the 82750PB, and
re-enter the reset state.

During this register transfer, and on all subsequent
register transfers (programmed or automatic), the
82750DB performs a vertical checksum on the
register data. The last 32-bit word read in during a
register transfer is the user-generated checksum of that
register data. If the 82750DB-generated checksum
does not match the user-generated checksum,
the 82750DB sends a SHUTDOWN code to the
82750PB via the VBUS, and will automatically
re-enter the reset state. The 82750DB will remain in the
reset state until the RESETS# input is toggled by
the host processor. Any VRAM requests or control
signals programmed to occur during this time will be
ignored.

Normal programmed operations start after the first
successful register load. Frame timing will start at
the beginning of a horizontal line and at the
beginning of the first field, sometimes referred to as line 1
of field 1. There will not be a horizontal sync pulse
on the first line after reset, but HSYNC will be
generated on every line thereafter. All horizontal and
vertical programming parameters, as well as scheduling
of any transfer requests and control information to
be sent on the VBUS, must be set up by the user
during the first register load. Included in the control
information are parameters for the 82750PB to
refresh the VRAM. Refresh must occur on every line.
This requires that the line rate of the 82750DB must
be at least 4 kHz to guarantee that enough refresh
cycles are generated. Additional register transfers
(up to one per line) may be programmed to occur on
any line during the field. As a result of this transfer,
display characteristics and programming parameters
may be changed.

After the first field, automatic register transfers will
occur on the second line of each subsequent field.
Note that all register transfers will occur at 1/3 of
the operating frequency of the 82750DB, unless the
1X or 1/2X SCLK mode has been programmed by
the user.

Throughout the reset process, the states of all
outputs become valid at various times. Specifically,
after being held low for at least 10 T-cycles,
RESETS# must transition to a high state in order
to initiate normal operation. By the time RESETS#
reaches this low to high transition, the states of
SCLK[1:0], VBUS[3:0], HSYNC, VSYNC, CSYNC,
and FCO are valid. 10 T-cycles following the
transition of RESETS# from low to high, the states of
BG, CB, ACTDIS, PIXCLK, DGY[7:0], DRV[7:0], and
DBU[7:0] become valid. The ALPHA[7:0] and BPP[1:0]
signals reach a valid state 10 T-cycles following the
completion of the first register load following reset.

Input/Output Transformation 

In general, the control outputs, including the sync
signals, are delayed by pipelining effects from their
corresponding inputs. If the output sync signals are
taken as the time base, the first pixel in a line is
actually fetched by an SCLK that is up to 19 T-cycles
before its corresponding PIXCLK. Some later pixels
may be delayed by an additional number of T-cycles,
depending upon bits/pixel, pixel timing, and whether
Y interpolation is enabled.

Outside of the active display region and before the
blanking output is asserted, border pixels are output.
Where the blanking region has been entered and the
display is not active, the output is the value
contained in the Blanking Color register.

Pixel handling in the active region is defined by three
parameters:

1. The bits/pixel parameter.

2. Whether VU interpolation is in effect or not.

3. Whether the 82750DB Enable bit has been selected.

VU interpolation is in effect for a given pixel if:

1. The VU interpolator is turned on (VU sample load
set to non-zero load value),

AND

2. VU interpolation display is permitted (VU
interpolation display operations bit equals 1),

AND

3. One of the two following conditions is met:

a. Either the interpolation is unconditional,

OR

b. The controlling Y or the controlling U sample
for this pixel has a least significant bit of 1.

The value of the alpha output may come from one of
the following three sources:

1. It may be explicitly coded into the pixel data
(32-bit/pixel and pseudo 16-bit/pixel with Alpha
modes only).

2. It may be output from one of two programmable
registers, Alpha0 and Alpha1.

3. During the portion of the display when the border
is active, the 8 most significant bits of the Border
Alpha register may be output.

Table 3-1 illustrates how the Alpha outputs are
selected.


Table 3-1. Selecting Alpha Outputs

Alpha     Alpha
Enable    Trap Select    Alpha Output
------    -----------    -------------------------------------
  0           X          Alpha0 Register
  1           0          Alpha0 Register (8, 16 bpp)
  1           0          MS Byte of Pixel (32, Pseudo 16 bpp)
  1           1          Trap Match = 0, Alpha0 Register
  1           1          Trap Match = 1, Alpha1 Register

Genlocking on the 82750DB 


The genlocking algorithm on the 82750DB uses
horizontal and vertical resets, HRESET# and
VRESET#, obtained from an external device. When
the Genlock bit in the Miscellaneous Control register
is off, the 82750DB will ignore all signals present on
its HRESET# and VRESET# inputs. The 82750DB
will resync itself when the programmed end of line
count is received. This allows the user to turn off
genlock without having to worry about the state of
the input video.

When the Genlock bit is set to one, the 82750DB will
use the external resets to reset its internal horizontal
and vertical sync counters. In this case, the width of
the active line is determined by the HRESET#
signal, and the length of the field is governed by
VRESET#. The programmed values for these
registers will be ignored. As shown in Figure 3-1,
VRESET# and HRESET#, when asserted, take
effect just after the third falling edge of FREQIN.
VRESET# has no effect on the 82750DB if the first
half of the first line of an odd field or the second (and
only) half of the first line of an even field is already in
progress. HRESET# has no effect on the 82750DB
if it occurs during the programmed first half of the
line. The user may decrease the effect of jitter by
reducing the "window" during which the vertical
reset signal is supposed to occur. This can be done by
scheduling a register load to occur after the vertical
active display time has ended, thereby decreasing
the programmable horizontal active window to a size
acceptable for the video source. When VRESET# is
received during this reduced, programmed
horizontal active window, the 82750DB is reset to an
even vertical field. When VRESET# occurs at any
other time in the horizontal scan line, the 82750DB
is set to an odd field.

Digitizing Images with the 82750DB

Digitizing is enabled by setting the Digitize Enable bit
in the Miscellaneous Control register. Note that
enabling the digitize mode does not automatically
enable genlocking. The Genlock bit must be set
separately, if it is required. When digitizing, the 82750DB
is used to shift digitized data into the VRAM shift
registers, and then transfer this data into the VRAM
array.

The 82750DB also provides an external "digitizer
window" signal, FCO. This signal defines the vertical
active region during which the digitizer is enabled.
Typically, the user sets up the display parameters to
reflect the "window" of the display to be digitized. The
horizontal and vertical active window size can be selected
by programming the Active Start and Stop registers.
FCO is derived from the Vertical Start and Stop
registers, and is used to enable the digitizer to drive the
VRAM bus. During the programmed vertical blanking
interval the FCO signal will be negated, and
therefore the digitizer is prohibited from driving the VRAM
bus. This will allow data to be read from the VRAM
serial data bus during the automatic register transfer
that is performed at the start of the field. Note that it
will still be possible to program the 82750DB to
digitize during the vertical blanking interval, in order, for
example, to capture time codes from a VCR.

When capturing and displaying NTSC data, during
the horizontal blanking interval of the first display
line, a WRDIGINP command is sent on the VBUS to
the 82750PB. (Refer to Figure 3-2.) Recall that there
is a 5-line vertical pipeline delay through the
82750DB. If the first display line is programmed to
be n, the first display line will occur at n + 5.
Similarly, if the last line is programmed to be m, then the
last display line will be line m + 5. The WRDIGINP
VBUS code causes a dummy write transfer cycle
that places the VRAMs in the write mode. The
82750PB then sets the bitmap pointers to the first
line's address (L0). This code is immediately
followed by another WRDIGINP command that causes
the 82750PB to perform a write transfer cycle at the
L0 address. Since no digitized data has been read
in, invalid data is loaded into row L0 of the VRAM
array.

During the active display of the first display line, the
82750DB provides shift clocks at the programmed
pixel rate. The digitized data is shifted into the
VRAMs while the user-programmed horizontal active
window is active. During the horizontal blanking
interval of the next line, the 82750DB sends a WRDIGI
code to the 82750PB, thereby transferring the L0
data from the shift register to the VRAM array at the
L0 address. The 82750PB performs a pitch
calculation, pointing it to the L1 row. After the WRDIGI
transfer has finished, the 82750DB issues a
WRDIGINP command to the 82750PB that performs a
write transfer cycle at the L1 address. This will write the
L0 data into the L1 address. On the next line, the L1 row
will be written over with L1 data. This same
procedure continues for the entire active display, until the
last active line is reached (m + 5). A final pair of
WRDIGI and WRDIGINP codes is sent to the
82750PB to load in the last line of data. At the start
of horizontal sync of the next line, the FCO signal
will be negated.

Figure 3-2. Digitizing Example

The purpose of the WRDIGINP code may not be apparent
at first glance. This code ensures that the correct
data is written into the last selected VRAM address.
This is necessary when crossing the physical
boundaries of VRAM memory.

When the 82750DB is genlocked, the digitizing
device must also provide the HRESET# and
VRESET# signals. The device must ensure that
VRESET# is never asserted during the start of the
line. This allows a register transfer (which shortens
the active display and is required for digitizing) to
complete before the start of a field register transfer.


The vertical sync pulses are buffered, so the start of 
the field transfer request can be honored immediate- 
ly after the previous transfer request is finished. 

Also, captured NTSC data may be displayed on a
VGA-type monitor. This requires the 82750DB to
operate at a VGA frequency (approximately 31.5 kHz),
which is twice that of NTSC. Each line of captured
NTSC data is read into the 82750DB twice. Setting
the line replicate bit makes doubling of memory
unnecessary. Figure 3-3 illustrates how the 82750DB
operates in such a mode. The Line Replicate,
Digitizer, and Genlock bits in the Miscellaneous Control
register are assumed to be set to one. During the
HBI of the first display line, a dummy write transfer
cycle (WRDIGINP) places the VRAMs in the write
mode. The 82750PB then sets the bitmap pointers
to the first line's address (L0). This code is
immediately followed by a WRDIGINP command, causing the
82750PB to perform a write transfer cycle at the L0
address. Since no digitized data has been read in,
unknown values are loaded into row L0 of the VRAM
array.

Figure 3-3. Digitizing Example with Line Replicate


At the end of the first line the 82750DB sends two
WRDIGINP codes to the 82750PB, thereby
transferring the L0 data from the shift register to the VRAM
array at the L0 address. The 82750PB does not
perform a pitch calculation, so the pointer remains at
the address for L0. After the second display line
(which has the same data as the first line), a
WRDIGI code is sent to the 82750PB that writes the
L0 data to the L0 address and updates the bitmap
pointer to L1. The WRDIGINP code immediately
following this selects the L1 address. After the third
line of data, two WRDIGINP codes that select
the L1 address are sent. After the fourth line (which
has the same data as the third line), a write operation
is performed to load L1 data into the L1 address,
and the 82750PB pointer is updated to address L2.
A WRDIGINP code is sent to select the L2 address.
This same procedure continues for the entire active
display, until the last active line is reached (m + 5).
A final pair of WRDIGI and WRDIGINP codes, or two
WRDIGINP codes, is sent to the 82750PB to load in
the last line of data. At the start of horizontal sync of
the next line, the FCO signal will be negated.

4.0 PROGRAMMING THE 82750DB


Overview 

All registers are loaded by the issuance of a REGX
command from the 82750DB to the 82750PB over
the VBUS. This causes the 82750PB to load a
sequence of register values into the VRAM serial
output registers from an address designated by an
82750DB register pointer. After the request is
granted, a new 82750DB register word is read in with
each SCLK. Each 32-bit word consists of a register
address in the high byte and register values in the
rest of the word. The sequence is terminated by a
stop code that corresponds to the address byte
being equal to 0xff. A variable number of 32-bit words
can be loaded. During reset, if a stop code is not found
within 256 T-cycles, the register transfer is
terminated, a SHUTDOWN code is asserted on the VBUS,
and the 82750DB returns to the reset state. All
transfer requests are terminated at the start of a new
field. This ensures that non-terminating register
transfers caused by bad register data will be halted.
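The word layout described above (register address in the high byte, register values in the remaining 24 bits, address byte 0xff as the stop code) can be sketched as follows. The function names are hypothetical. The sketch models neither the checksum word, whose algorithm and position are not given here, nor the sixteen raw cursor words that follow the Cursor Control register.

```python
STOP_ADDRESS = 0xFF  # address byte that terminates a register load


def pack_register_word(address: int, value: int) -> int:
    """Build one 32-bit register word: address in bits 31:24, value in 23:0."""
    if not 0 <= address <= 0xFF:
        raise ValueError("register address must fit in one byte")
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("register value must fit in 24 bits")
    return (address << 24) | value


def split_register_words(words):
    """Return (address, value) pairs up to, but not including, the stop code."""
    pairs = []
    for word in words:
        address = (word >> 24) & 0xFF
        if address == STOP_ADDRESS:
            break
        pairs.append((address, word & 0xFFFFFF))
    return pairs
```

A host-side tool could use such helpers to assemble a register list in memory before the 82750PB transfers it, although the actual register addresses must be taken from the register descriptions.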

During this register transfer, and on all subsequent
register transfers (programmed or automatic), the
82750DB performs a vertical checksum on the
register data. The last 32-bit word read in during a
register transfer is the user-generated checksum of that
register data. If the 82750DB-generated checksum
does not match the user-generated checksum,
the 82750DB sends out a SHUTDOWN code to the
82750PB via the VBUS, and will automatically
re-enter the reset state.


Pipeline Delay through the 82750DB 


The actual horizontal pipeline delay through the
82750DB is dependent on the processing elements
used to generate the output. If Y interpolation is not
used, the pipeline delay is:

    Horiz. Active Pipeline Delay = 16 cycles +
        SCLK Transfer Timing Delay

Here the SCLK Transfer Timing Delay is 1 for 1X, 2
for 1/2X, and 3 for 1/3X.

If Y interpolation is used, the pipeline delay is:

    Horiz. Pipeline Delay = 16 cycles +
        SCLK Transfer Timing Delay + Integer (Pixel Time)

The Integer (Pixel Time) is simply the integer value
of the programmed pixel time. The horizontal pipeline
delay for blanking differs from that of the active
display. Whether Y interpolation is on or off, the
pipeline delay for horizontal blanking is:

    Horiz. Blanking Pipeline Delay = 10 cycles +
        SCLK Transfer Timing Delay

The horizontal sync pipeline delay is always equal to
0 cycles.
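The delay formulas above reduce to a few additions, as in the sketch below. The function names and the mode strings used as dictionary keys are hypothetical; the constants come from the formulas.

```python
# SCLK Transfer Timing Delay in T-cycles for each transfer mode.
SCLK_DELAY = {"1X": 1, "1/2X": 2, "1/3X": 3}


def horiz_active_delay(sclk_mode: str, y_interp: bool = False,
                       pixel_time: float = 1.0) -> int:
    """Horizontal active pipeline delay in T-cycles."""
    delay = 16 + SCLK_DELAY[sclk_mode]
    if y_interp:
        # Y interpolation adds the integer part of the pixel time.
        delay += int(pixel_time)
    return delay


def horiz_blanking_delay(sclk_mode: str) -> int:
    """Horizontal blanking pipeline delay in T-cycles."""
    return 10 + SCLK_DELAY[sclk_mode]
```

For the 1/3X mode without Y interpolation the active delay is 19 T-cycles, which agrees with the figure of up to 19 T-cycles quoted under Input/Output Transformation.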


Thus all horizontal parameters, (e.g. horizontal 
blanking start, active stop) must be programmed to 
account for the total horizontal pipeline delay. The 
vertical pipeline delay. The vertical blanking and 
vertical sync pipeline delay are always equal to 0 
lines. All vertical parameters must be programmed
so that this delay is taken into account.

PROGRAMMING CONSIDERATIONS

The user must ensure that the 82750DB is pro- 
grammed correctly. Illegal or illogical combinations 
of display parameters are not corrected in hardware, 
and may cause the 82750DB to output erroneous 
display or timing information. The following list high- 
lights some basic guidelines to follow when pro- 
gramming the 82750DB. 

1. The maximum rate at which data may be read into the
82750DB is determined by the type of memory
used. This in turn affects the maximum rate and
depth of data that can be displayed. If 32 bits of
data can only be read into the 82750DB every
two clock cycles, only 16 bits of data may be
displayed every clock cycle. The programmer
should match the transfer rate (1X, 1/2X, or
1/3X) with the memory speed, and the display
pixel rate with the pixel depth and memory
bandwidth.

2. Blanking intervals of the display are defined by
the non-active programmed time. During this
portion of the display, programmed transfers take
place. If a transfer does not complete before the
start of the active display, it is terminated, and
active display data is shifted into the 82750DB at
the programmed rate. During horizontal blanking
intervals, the user should allow enough time for
all programmed register, colormap, and VU data
transfers to complete.

3. When digitizing (capturing) images, no other bitmap
transfers (e.g., REGX, VU) should be scheduled
to occur during the active portion of the field.

4. Active start and stop times should not be
programmed to overlap the blanking stop and start
times, taking the pipeline delay through the
82750DB into account.

5. Programming the Y interpolation to occur in a
non-integral pixel width will cause the Y channel
to output incorrect data.

CURSOR REGISTERS 

The following registers are used to program the 
characteristics of the on-chip cursor. 


Cursor Control Register 0x5a

    Bits 31-24    Bits 23-12           Bits 11-0
    01011010      Vertical Position    Horizontal Position

    Horizontal Position: units of T-cycles
    Vertical Position: units of full lines


This register gives the horizontal and vertical
position of the cursor. The cursor extends for 16 pixel
periods, starting at the prescribed horizontal position,
for the next 16 lines (or 32 pixel periods for 32
lines if the 2X Cursor Mode bits in the General Control
register are set to one). Receipt of this address
also causes the 82750DB to interpret the next sixteen
32-bit words of register data as the 16 x 16 x
2-bit cursor map: the register address decoding logic
internal to the 82750DB is disabled, and the next
16 words are loaded into the Cursor table. Each
32-bit word is interpreted as one line (16 pixels) of
cursor data, with the two least significant bits
corresponding to the first cursor pixel to be displayed.
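
As an illustration of the cursor map layout described above, the following Python sketch packs one cursor line into a 32-bit word. The function name is hypothetical. The text only states that the first pixel occupies the two least significant bits; placing each following pixel in the next higher bit pair is an assumption.

```python
def pack_cursor_line(pixels):
    """Pack 16 two-bit cursor pixels into one 32-bit cursor map word.

    pixels[0] is the first pixel displayed and lands in bits 1:0.
    Assumption: pixel i occupies bits (2*i + 1):(2*i).
    """
    if len(pixels) != 16:
        raise ValueError("a cursor line holds exactly 16 pixels")
    word = 0
    for i, value in enumerate(pixels):
        if not 0 <= value <= 3:
            raise ValueError("each cursor pixel is a 2-bit value")
        word |= value << (2 * i)
    return word
```

Sixteen such words, one per line, would follow the Cursor Control register word in the register data stream.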


Cursor Color 3 0x59

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01011001      Blue/U Color    Red/V Color    Green/Y Color

If the cursor is enabled and the 24 bits of data in this
register are selected, the data will be sent directly to
the YUV conversion matrix during active display. The
bits should be programmed as RGB values when the
YUV to RGB matrix is not being used.


Cursor Color 2 0x58

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01011000      Blue/U Color    Red/V Color    Green/Y Color

If the cursor is enabled and the 24 bits of data in this
register are selected, the data will be sent directly to
the YUV conversion matrix during active display. The
bits should be programmed as RGB values when the
YUV to RGB matrix is not being used.


Cursor Position Update Register 0x5b

    Bits 31-24    Bits 23-12           Bits 11-0
    01011011      Vertical Position    Horizontal Position

    Horizontal Position: units of T-cycles
    Vertical Position: units of full lines

This register gives the horizontal and vertical position
of the cursor. The cursor extends for 16 pixel
periods, starting at the prescribed horizontal position,
for the next 16 lines (or 32 pixel periods for 32
lines if the 2X Cursor Mode bits in the General Control
register are set to one).


Cursor Color 1 0x57

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01010111      Blue/U Color    Red/V Color    Green/Y Color


If the cursor is enabled and the 24 bits of data in this
register are selected, the data will be sent directly to
the YUV conversion matrix during active display. The
bits should be programmed as RGB values when the
YUV to RGB matrix is not being used.

DISPLAY TIMING REGISTERS

Each register has two 12-bit components, listed
with the 12 least significant bits first, followed by the
12 most significant bits. Horizontal timing is measured
in units of T-cycles (periods of the master clock)
from the start of horizontal sync. The register content
defines the number of T-cycles that elapse before
the event controlled by this register takes place.
The exception to this rule is the base counter, which
specifies the number of T-cycles/half line. Zero is
not an allowable value; use the total number of
T-cycles per half line or full line instead. Unused bits
should be zero. Sync signals are RESET to initial
values as specified for each; "start" means to set to
1, and "stop" means to reset to zero.


Base Counter 0x56

    Bits 31-24    Bits 23-12               Bits 11-0
    01010110      # of Half Lines/Field    # of T-Cycles/Half Line

    T-cycles/Half Line: units of T-cycles (periods of the
    master clock)
    Half Lines/Field: units of half lines

As defined by NTSC standards, vertical timing can
be measured from the start of a field in one of two
ways: either in units of half lines, or in units of full
lines. When programmed for an interlaced display
(i.e., an odd number of half lines per field), the start of
a field coincides with the start of a line on odd fields
and with the midpoint of a line on even fields. In the
latter case, for an event that is programmed in full
lines, the first half line is ignored, and counting begins
with the first full line. With this interpretation, the
register content defines the number of half or full
lines that elapse before the event controlled by this
register takes place. The same may be said for the
horizontal component, which is defined by the number
of T-cycles/half line. The hardware does not
look for nor correct illogical combinations of register
settings. The monitor should be protected from damage
with external circuitry when debugging is in
progress.

All of the internal timing is derived from comparing
the programmed values with the values of this register.
The horizontal base counter is programmed using
the least significant 12 bits; the value loaded into
these bits should be one less than the desired value.
Bits 23 through 12 are used to
specify the number of half lines per field.
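
The two-component register format can be sketched as follows. The function names are hypothetical, and the line geometry in the usage note is purely illustrative. The sketch applies the "one less than desired" rule only to the horizontal component, which is how the paragraph above reads; whether it also applies to the half-line count is not stated.

```python
def pack_timing(address, upper12, lower12):
    """Build a 32-bit register word: address in bits 31-24,
    one 12-bit component in bits 23-12 and one in bits 11-0."""
    if not 0 <= address <= 0xFF:
        raise ValueError("register address must fit in 8 bits")
    for value in (upper12, lower12):
        if not 0 <= value <= 0xFFF:
            raise ValueError("timing components must fit in 12 bits")
    return (address << 24) | (upper12 << 12) | lower12


def base_counter_word(t_cycles_per_half_line, half_lines_per_field):
    """Base Counter (0x56) word; the horizontal count is stored minus one."""
    return pack_timing(0x56, half_lines_per_field, t_cycles_per_half_line - 1)
```

With a hypothetical 456 T-cycles per half line and 525 half lines per field, `base_counter_word(456, 525)` yields 0x5620D1C7.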


Sync Stops 0x55

    Bits 31-24    Bits 23-12    Bits 11-0
    01010101      VSYNC Stop    HSYNC Stop

    HSYNC Stop: units of T-cycles
    VSYNC Stop: units of half lines


Sync Starts 0x54

    Bits 31-24    Bits 23-12     Bits 11-0
    01010100      VSYNC Start    HSYNC Start

    HSYNC Start: units of T-cycles
    VSYNC Start: units of half lines


The Sync Stops and Sync Starts registers are used 
in conjunction with one another to specify the start 
and stop locations of the horizontal sync, HSYNC, 
and vertical sync, VSYNC, output signals. VSYNC 
may be programmed to start and stop at any time 
during a given field as defined on a half-line interval.
Bits 23 through 12 in the Sync Starts and Sync 
Stops registers are used to define the start and stop 
times for VSYNC, respectively. Similarly, HSYNC 
may be programmed to start and stop at any line 
position as defined in units of T-cycles. Bits 11 
through 0 in the Sync Starts and Sync Stops regis- 
ters are used to define the start and stop positions 
for HSYNC, respectively. 

The horizontal component of the Sync Stops register
also affects the composite sync, or CSYNC, output. In
this case, the CSYNC output will be the same as the 
HSYNC output, except during the vertical sync and 
equalization interval. In the latter case, the CSYNC 
output is determined by the Serration and Equaliza- 
tion registers. 


Blanking Stops 0x53

    Bits 31-24    Bits 23-12             Bits 11-0
    01010011      Vertical Blank Stop    Horizontal Blank Stop

    HB Stop: units of T-cycles
    VB Stop: units of half lines


The Blanking Start and Stop registers control the
composite blanking output (CB). The horizontal
blanking start and stop positions, in units of T-cycles,
can be specified to occur at any time during the line.
By the same token, the vertical blanking start and
stop positions can be programmed to occur at any
half-line interval.

The CB output combines both the horizontal and
vertical blanking pulses programmed using these
two registers. This information is independent of
the HSYNC, VSYNC, and CSYNC outputs, so the
user must specify the proper blanking intervals for
the monitor that is being used. If the programmer
specifies the blanking period to end before the active
line starts, or to start after the active line has ended,
the border color is output. Due to internal pipeline
delays on the 82750DB, the values should be
one less than desired for VB Start and Stop. For HB
Start and Stop, subtract the total horizontal pipeline
delay.
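
The adjustment rule above can be written out as a small Python sketch. The function name is hypothetical, and the horizontal pipeline delay is passed in by the caller because the text does not say here which of the delays from the pipeline section applies. Rejecting negative results is also an assumption; the datasheet does not describe wrap-around behavior.

```python
def blanking_register_values(hb_desired, vb_desired, horiz_pipeline_delay):
    """Return (HB value, VB value) to program for a desired blanking edge.

    HB: desired T-cycle position minus the total horizontal pipeline delay.
    VB: desired half-line position minus one.
    """
    hb_value = hb_desired - horiz_pipeline_delay
    vb_value = vb_desired - 1
    if hb_value < 0 or vb_value < 0:
        # Assumption: negative results are treated as a programming error.
        raise ValueError("desired position is earlier than the pipeline allows")
    return hb_value, vb_value
```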


Blanking Starts 0x52

    Bits 31-24    Bits 23-12              Bits 11-0
    01010010      Vertical Blank Start    Horizontal Blank Start

    HB Start: units of T-cycles (resets to 1)
    VB Start: units of half lines (resets to 1)


Program values one less than desired for VB Start
and Stop. For horizontal blanking start, load the
desired value less the total horizontal pipeline delay.


Serration Start 0x51

    Bits 31-24    Bits 23-12    Bits 11-0
    01010001      Not Used      Serration Start

    SER Start: units of T-cycles (resets to 0)
    Bits 23-12: not used


The vertical component of the CSYNC (composite
sync) signal is made up of two types of pulses:
equalization and serration pulses. The window during
which the serration pulses are active is determined
by the VSYNC start and stop positions, as
shown in Figure 4-1. When vertical sync (VSYNC) is
active, in this case on line 3, the first serration pulse
is output on the CSYNC signal. This pulse will start
at the T-cycle count specified in bits 11 to 0 of the
Serration Start register. The pulse will end when the
half-line count specified in the Base Counter register
has been reached. This pulse will be repeated for
every half line that the VSYNC output is programmed
to be active, regardless of the position in
the field. In Figure 4-1, this continues until half line
12, or line 6.


[Figure 4-1: timing diagram of the CSYNC and VSYNC outputs against
line count from the start of the odd field, showing the
pre-equalization pulses, serration pulses, and post-equalization
pulses, with the Horizontal Equalization Stop, Vertical Serration
Start, and Vertical Equalization Stop positions marked.]

Figure 4-1. Programming the Video Sync Outputs


Equalization Parameters 0x50

    Bits 31-24    Bits 23-12                    Bits 11-0
    01010000      Vertical Equalization Stop    Horizontal Equalization Stop

    EQH Stop: units of T-cycles (resets to 1)
    EQV Stop: units of half lines (resets to 1)


During the vertical equalizing period, which starts at
field-beginning, an equalization pulse is output on
the CSYNC signal at the beginning of each half line,
as shown in Figure 4-1. The width of this equalization
pulse is determined by the value in bits 11 to 0
of this register. The half line on which these pulses
are to stop is programmed in bits 23 through 12 of
this register. If VSYNC is programmed to occur during
the equalization interval (as it is for NTSC type
displays), the serration pulses are output on the
CSYNC signal.


Active Region Stops 0x4f

    Bits 31-24    Bits 23-12              Bits 11-0
    01001111      Vertical Active Stop    Horizontal Active Stop

    Actdis Stop: units of T-cycles
    Vertical Stop: units of full lines


The active region window, during which pixels to be
displayed are fetched from VRAM, is defined by the
Active Region Start and Stop registers. The first display
line is actually five lines after the line indicated
in the vertical component of the Active Region Start
register. The position of the active region on a horizontal
line is determined by the horizontal component of
the Active Region Start register. Pixels will be
fetched from VRAM at a rate determined by the
number of bits/pixel and pixel widths. In order for the
82750DB to operate properly, the horizontal width of
the active region window must be an integral number
of display pixel widths, taking into account the horizontal
pipeline delay. Also, the Active Region Start
and Stop must fall within a single line boundary, as
dictated by the Base Counter register. The time at which
the first pixel actually appears at the output of the
82750DB is a function of the processing elements
used as discussed above.
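
A quick way to check the integral-width requirement is sketched below in Python. The function name is hypothetical. Pixel times are multiples of 0.5 T-cycle, so the check is done in half T-cycles to avoid fractional arithmetic. The sketch ignores the pipeline delay, on the assumption that it shifts the start and stop positions equally and so leaves the width unchanged.

```python
def active_width_is_integral(h_start, h_stop, pixel_time):
    """True if (h_stop - h_start) T-cycles is a whole number of pixels.

    pixel_time is the programmed pixel duration in T-cycles (1.0, 1.5, ...).
    """
    half_cycles_per_pixel = pixel_time * 2
    if half_cycles_per_pixel <= 0 or half_cycles_per_pixel != int(half_cycles_per_pixel):
        raise ValueError("pixel time must be a positive multiple of 0.5 T-cycle")
    width_half_cycles = (h_stop - h_start) * 2
    if width_half_cycles <= 0:
        raise ValueError("active stop must come after active start")
    return width_half_cycles % int(half_cycles_per_pixel) == 0
```

For instance, a 640 T-cycle window holds exactly 256 pixels of 2.5 T-cycles each, whereas a 641 T-cycle window does not divide evenly.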

When the active region is over, the border color is
output until the programmed blanking time is
reached. Both the border and blanking information are
output at the transfer rate programmed by the user.


Active Region Starts 0x4e

    Bits 31-24    Bits 23-12               Bits 11-0
    01001110      Vertical Active Start    Horizontal Active Start

    Actdis Start: units of T-cycles
    Vertical Start: units of full lines


Burst Gate Stop 0x4d

    Bits 31-24    Bits 23-12          Bits 11-0
    01001101      Vertical BG Stop    Horizontal BG Stop

    Horizontal Stop Position: units of T-cycles
    Vertical Stop Position: units of full lines


The Burst Gate Horizontal and Vertical Start and 
Stop registers allow the user to program a window 
into which burst can be added. This is useful when 
modulating the outputs of the 82750DB. 


Burst Gate Start 0x4c

    Bits 31-24    Bits 23-12           Bits 11-0
    01001100      Vertical BG Start    Horizontal BG Start

    Horizontal Start Position: units of T-cycles
    Vertical Start Position: units of full lines


VBUS CODE REGISTERS 

The following group of registers is used by the
programmer to schedule when VBUS transfer or control
codes are to be sent to the 82750PB by the
82750DB.


Display Format Load Interrupt 0x4b

    Bits 31-24    Bits 23-12               Bits 11-0
    01001011      Vertical DFL Position    Horizontal DFL Position

    Horizontal Position: units of T-cycles
    Vertical Position: units of full lines


This is the programmable XY interrupt, used by the
82750PB to perform a load of the Shadow Copy registers.
This interrupt is sent on the VBUS when
bits 23 to 12 match the current display line position
and bits 11 to 0 match the T-cycle count.










Line Notification Timing 0x4a

    Bits 31-24    Bits 23-12    Bits 11-0
    01001010      Not Used      Horizontal HLIN Position

    HLIN timing: units of T-cycles
    Bits 23-12: not used

This indicates the position on each line at which to send an
HLINE code on the VBUS. The 82750PB requires
this information to keep track of the current display
line when drawing graphics.


Alpha Register 0x47

    Bits 31-24    Bits 23-16      Bits 15-8          Bits 7-0
    01000111      Border Alpha    Alpha1 Register    Alpha0 Register

The least significant 8 bits are for the ALPHA0 register
and are used during blanking and if the alpha trap
value is not matched. The next 8 bits are for the
ALPHA1 register, used when the alpha trap value is
matched. The most significant 8 bits provide the
alpha channel value during the border time.


Refresh and Register Transfer 0x49

    Bits 31-24    Bits 23-12          Bits 11-0
    01001001      REGX Line Number    Refresh Horizontal Position

    REFRESH horizontal timing: units of T-cycles
    Register Transfer Line number: units of full lines


When the T-cycle count matches the value programmed
into bits 11 to 0 of this register, a refresh
code is sent to the 82750PB. Since these codes tie
up the 82750PB for at least eight 82750PB cycles,
the programmer must ensure that no transfer requests
are scheduled to occur during this time.

The line number for the next register transfer is
specified in bits 23 to 12 of this register. If programmed
to occur, REGX will always be the first
transfer request sent to the 82750PB, immediately
after the end of active display.


COLOR REGISTERS 

The following registers specify the state of DBU, 
DRV, DGY, and ALPHA signals during the field. 


Border Color 0x48

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01001000      Blue/U Color    Red/V Color    Green/Y Color


The 24 bits of data in this register are sent directly to
the YUV conversion matrix during border time. Border
time is defined as the region in which neither
active display nor blanking is programmed to occur.
The bits should be programmed as RGB values
when the YUV to RGB matrix is not being used.


Blanking Color 0x46

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01000110      Blue/U Color    Red/V Color    Green/Y Color


The 24 bits of data in this register are sent directly
through the YUV conversion matrix during the programmed
blanking time.


CONTROL REGISTERS 

The following registers are used to define the oper- 
ating modes of the 82750DB. 


Pixel Control 0x45

    Bits     Field
    23       Pseudo 16-Bit Mode
    22       VU Pixel Replicate
    21:19    Bits/Pixel
    18:14    Pixel Time
    13:11    VU Sample Select
    10       4X VU Expand
    9        VU Interlace Enable
    8        Conditional Interpolation Enable
    7        VU Interpolation Round
    6:0      SCLK Delay


Bits 6:0 — SCLK Delay

The number "m" of T-cycles from initiation of a
transfer request on the VBUS until the first SCLK is
asserted by the 82750DB.

Bit 7 — VU Interpolation Round

When equal to 0, this bit means truncate during
interpolation. When set to one, this bit means round to
odd during interpolation.

Bit 8 — Conditional Interpolation Enable

When reset to zero, this bit means all values of Y
and U have a full 8 bits of precision. When set to one,
this bit means the least significant bit of the Y sample
or the U sample controls the switching between VU
interpolation and graphics mode.

Bit 9— VU Interlace Enable 

Setting this bit to a one causes the interpolator to 
output different data on the odd and even fields. 
During the odd field, the odd lines of the interpola- 
tion sequence will be output. During the even field, 
the even lines of the interpolation sequence will be 
output. Full lines of the programmed number of sam- 
ples of both the V and U data will be read in during 
each VU transfer. Setting this bit to a zero will cause 
horizontally and vertically interpolated data to be 
output on both fields. Only a full line of either V or U 
samples will be read in during each transfer request 
in this mode. 


Bit 10— 4X VU Expand 

When this bit is set to a zero, a 2X expansion in both 
directions is performed. By setting this bit to a one, a 
4X expansion is performed. 

Bits 13:11 — VU Sample Select 

Table 4-1 provides the code and number of V and U
samples for bits 13:11.


Table 4-1. VU Sampling

    Code    Number of V and U Samples
    000     0 Samples for Each of V and U
    111     32 Samples for Each of V and U
    110     64 Samples for Each of V and U
    101     96 Samples for Each of V and U
    100     128 Samples for Each of V and U
    011     160 Samples for Each of V and U
    010     192 Samples for Each of V and U
    001     256 Samples for Each of V and U


Bits 18:14— Pixel Time 

Table 4-2 lists the codes and pixel duration for bits
18:14.


Table 4-2. Pixel Times

    Code     Duration of Pixel
    00001    1.0 T-cycle
    00010    1.5 T-cycles
    00100    2.0 T-cycles
    01000    2.5 T-cycles
    10000    3.0 T-cycles
    10001    3.5 T-cycles
    10010    4.0 T-cycles
    10100    4.5 T-cycles
    11000    5.0 T-cycles
    11001    5.5 T-cycles
    11010    6.0 T-cycles
    11100    6.5 T-cycles
    11101    7.0 T-cycles
    11110    7.5 T-cycles
    00011    8.0 T-cycles
    00101    8.5 T-cycles
    00110    9.0 T-cycles
    00111    9.5 T-cycles
    01001    10.0 T-cycles
    01010    10.5 T-cycles
    01011    11.0 T-cycles
    01100    11.5 T-cycles
    01101    12.0 T-cycles
    01110    13.0 T-cycles
    01111    14.0 T-cycles

Bits 21:19 — Bits/Pixel

Table 4-3 provides the code and number of bits/pixel
for bits 21:19.


Table 4-3. Number of Bits/Pixel

    Code    Number of Bits/Pixel
    001     8
    010     16
    100     32

Bit 22— VU Pixel Replicate 

When set to one, each pixel generated by the VU
interpolator is held for 2 pixel times. This allows an
effective 8X expansion of VU data. This is useful for
high resolution applications where the blanking time
is not sufficient to support higher VU sample loads.

Bit 23 — Pseudo 16-Bit Mode

When set to one and 16 bits per pixel is chosen (bits
21:19), the 82750DB is in the 16-bit with Alpha
mode. Setting this bit to zero while in the 16-bit/pixel
mode puts the 82750DB into the 16-bit (655)
mode. This bit is a "don't care" input for all
other values of bits/pixel.
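
To show how the Pixel Control fields fit together, here is a minimal Python sketch that assembles a register word from the codes in Tables 4-1 to 4-3. The function and dictionary names are hypothetical, only the first seven rows of Table 4-2 are reproduced, and the single-bit options other than Pseudo 16-Bit Mode are left at zero. Placing the register address 0x45 in bits 31-24 is an assumption based on the layout of the other registers.

```python
# Codes copied from Tables 4-1 to 4-3 (Table 4-2 only partly reproduced).
PIXEL_TIME_CODE = {1.0: 0b00001, 1.5: 0b00010, 2.0: 0b00100, 2.5: 0b01000,
                   3.0: 0b10000, 3.5: 0b10001, 4.0: 0b10010}
BITS_PER_PIXEL_CODE = {8: 0b001, 16: 0b010, 32: 0b100}
VU_SAMPLES_CODE = {0: 0b000, 32: 0b111, 64: 0b110, 96: 0b101,
                   128: 0b100, 160: 0b011, 192: 0b010, 256: 0b001}


def pack_pixel_control(sclk_delay, pixel_time, bits_per_pixel,
                       vu_samples=0, pseudo_16bit=False):
    """Assemble a Pixel Control word (address assumed in bits 31-24)."""
    if not 0 <= sclk_delay <= 0x7F:
        raise ValueError("SCLK delay must fit in bits 6:0")
    word = 0x45 << 24
    word |= int(pseudo_16bit) << 23
    word |= BITS_PER_PIXEL_CODE[bits_per_pixel] << 19
    word |= PIXEL_TIME_CODE[pixel_time] << 14
    word |= VU_SAMPLES_CODE[vu_samples] << 11
    word |= sclk_delay
    return word
```

Using lookup tables keyed by the physical quantity (bits per pixel, T-cycles, sample count) keeps the non-monotonic hardware codes out of the calling code.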


General Control 0x44

    Bits     Field
    16:13    Vblen
    12       Viden
    11       Gren
    10       Sync Test
    9:8      Channel Test Select
    7        2X Vertical Cursor
    6        2X Horizontal Cursor
    5        Cursor Enable
    4:0      Burst Multiple


Bits 4:0 — Burst Multiple

These bits are used to program a divisor of the
FREQIN clock input in order to recover the
3.58 MHz NTSC color subcarrier. The programmed
value is the two's complement of the desired divisor.
The allowed range of values is 00001 through 11111,
which corresponds to divisions of 31 through 1. Note
that the 82750DB must be operating at an integer
multiple of 3.58 MHz for this to work effectively.

Bit 5 — Cursor Enable

When set to one, the hardware cursor will output the
cursor data at prescribed intervals if programmed to
do so.

Bit 6 — 2X Horizontal Cursor

When this bit is set to one, and the Cursor Enable bit
is set to one, every pixel on each line of the cursor
will be replicated once. Thus a cursor that was
16 x 16 pixels will become 32 x 16 pixels.

Bit 7 — 2X Vertical Cursor

When this bit is set to one, and the Cursor Enable bit
is set to one, each line of the cursor will be replicated
once. Thus a cursor that was 16 x 16 pixels will
become a 16 x 32-pixel cursor.

Bit 9:8 — Channel Select

These two bits control which output channel is
muxed onto the alpha digital outputs. It allows Y, U,
or V data to be available at the alpha channel. The
coding is provided in Table 4-4.


Table 4-4. Test Mode Select Coding

    Code    Alpha Channel Output
    00      Alpha Channel
    01      Y Channel
    10      V Channel
    11      U Channel


Bit 10 — Sync Test

This bit must be set to zero for proper operation.

Bit 11 — Gren

This is the Graphics Enable bit for the Y Interpolator.
When this bit is set to one and the pixel is a graphics
pixel (switch is zero), a 2X interpolation will be
performed on the pixel.

Bit 12 — Viden

This is the Video Enable bit of the Y Interpolator.
When this bit is set to one and the pixel is a video
pixel (switch is one), a 2X interpolation will be
performed on the pixel.

Bit 16:13 — Vblen

These bits program the T-cycle length of each VBUS
code. The VBUS code length will be one T-cycle
longer than the programmed value. These bits must
have a minimum value of 2, and a maximum value of
15.
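
The Burst Multiple encoding in the General Control register (bits 4:0) is a 5-bit two's complement of the divisor, which can be sketched as below. The function name is hypothetical.

```python
def burst_multiple_code(divisor):
    """Return the 5-bit Burst Multiple field for a FREQIN divisor of 1..31.

    The field holds the two's complement of the divisor, truncated to 5 bits,
    so a divisor of 1 gives 0b11111 and a divisor of 31 gives 0b00001.
    """
    if not 1 <= divisor <= 31:
        raise ValueError("divisor must be in the range 1..31")
    return (-divisor) & 0x1F
```

As an illustrative case, a FREQIN of eight times the color subcarrier (about 28.6 MHz) would need a divisor of 8, which encodes as 0b11000.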
Miscellaneous Control 0x43

    Bits     Field
    22       Line Replicate Enable
    21       82750DB Enable
    20:19    Transfer Timing Select
    18       Video Pass
    17       Graphics Pass
    16       Split CLUT
    15       Bypass Conversion Matrix
    14       Genlock Enable
    13       Switch on LS Bit of Y
    12       Alpha Enable
    11       VU Interpolator Output Enable
    10       Digitize Enable
    9        Border Alpha Enable
    8        Alpha Trap Select
    7:0      Alpha Trap


Bits 7:0 — Alpha Trap

Bits 7:0 hold an 8-bit value used for comparison with
the current pixel's Y value, to select one of two
programmable alpha values.

Bit 8 — Alpha Trap Select

A value of one enables the Y value of the current
pixel to be compared with the value in the Alpha
Trap register. If the two values match and Alpha has
been enabled via the Alpha Enable bit, the contents
of the ALPHA1 register are output on ALPHA[7:0]. If
the two values don't match and Alpha Enable has
been set to one, the content of the ALPHA0 register
is output. When Alpha Trap Select is set to zero in
the pseudo 16- or 32-bit mode, the most significant
byte of the pixel word is output. When Alpha Trap
Select is set to zero in all other modes, the value of
the ALPHA0 register is output.

Bit 9 — Border Alpha Enable

A value of one enables the eight most significant bits
in the ALPHA register to be output. When set to
zero, the ALPHA0 register is output during border
time.

Bit 10 — Digitize Enable

When this bit is set to one, the FCO signal will be
set to one, and the transfer codes for bitmaps will
indicate that write operations should occur.

Bit 11 — VU Interpolator Output Enable

This bit enables VU interpolation data to be displayed.
When set to zero, all pixels are treated as
graphic pixels.

Bit 12 — Alpha Enable

When set to one, the alpha output is governed by
the alpha trap value, as described above. When reset
to zero, the alpha output is the contents of the ALPHA0
register in the 8- and 16-bit modes, and the
explicit ALPHA data encoded in the pixel in the pseudo
16- and 32-bit modes.


Bit 13— Switch on LS Bit of Y 

When set to one, the least significant bit of Y is used 
as a Video/Graphics switch in all modes. When re- 
set to zero, the least significant bit of U from the 
interpolator acts as a switch. 

Bit 14— Genlock Enable 

This bit enables the genlock mode of the 82750DB.
In this mode, receipt of the external HRESET# signal
during the second half of a scan line will cause
the termination of that scan line. Similarly, receipt of
the externally produced VRESET# signal will terminate
the field. In both cases, terminate denotes that
the proper on-chip signals are produced to signify
end of the line and end of the field.


Bit 15— Bypass Conversion Matrix 

When this bit is set to a one the YUV to RGB matrix 
will be bypassed, and the Y, U, and V data will feed 
directly into the Digital to Analog Converters. 

Bit 16 — Split CLUT

This bit divides the CLUT into an odd and an even
half, depending on the polarity of the Video/Graphics
switch. This switch is selectable and may be either
the LSB of U from the interpolator or Y from the
pixel word. The LSB of the CLUT address is set to
one (odd address) if the Video/Graphics switch is
one; the LSB of the CLUT address is set to zero
(even address) if the Video/Graphics switch is zero.
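
The Split CLUT addressing rule amounts to forcing the address LSB to the switch value. A minimal Python sketch follows; the function name is hypothetical.

```python
def split_clut_address(address, video_graphics_switch):
    """Force the LSB of an 8-bit CLUT address to the Video/Graphics switch.

    switch = 1 selects the odd half of the table, switch = 0 the even half.
    """
    if not 0 <= address <= 0xFF:
        raise ValueError("CLUT address must fit in 8 bits")
    if video_graphics_switch not in (0, 1):
        raise ValueError("switch must be 0 or 1")
    return (address & 0xFE) | video_graphics_switch
```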

Bit 17— Graphics Pass 

Setting this bit to a one bypasses the CLUT for
graphics pixels, even in non-mixed modes.

Bit 18 — Video Pass

When set to a one, all video pixels (luminance values
associated with sub-sampled UV values) will bypass
the color table. For mixed modes, this corresponds
to the switch flag having a value of one.

Bit 20:19 — Transfer Timing Select

These bits form a two-bit code that selects one of three
possible transfer shift clock rates. This allows the
operating speed of the 82750DB to be tailored to the
external memory access time. After RESET, the
transfer rate is set to the slowest possible clock rate
(1/3X). The programmed rate is used during all
non-active display times for transferring data from
VRAMs. It also defines the rate at which the border and
blanking data is output. During active display, the
data is read as needed from VRAM using the programmed
timing. The coding of these bits is listed in
Table 4-5.


Table 4-5. Coding of Transfer Timing Select Bits

    Bit 20    Bit 19    Result
    0         0         1/3X Transfer (Default)
    0         1         1/2X Transfer
    1         0         1X Transfer


Bit 21— 82750DB Enable 

When set to zero, the 82750DB will be the register 
equivalent of a 82750DA. When set to a one all the 
features of the 82750DB will be enabled. 


Bit 22 — Line Replicate Enable

When this bit is set to one, every line in the active
display is generated twice. Each new bitmap transfer
occurs at half the line rate, with a new VBUS code
being used to indicate that a transfer is to take place
without the pitch calculation. The VU Interpolator will
also duplicate the lines it generates, yielding more
time between transfer cycles. This mode is useful for
obtaining a 2X increase in vertical resolution without
the need for increasing the VRAM transfer bandwidth.


COLOR MAP REGISTERS 

The following registers are used to access and con- 
trol the three 256 x 8-bit Color Lookup Tables. 


Mask Data Registers 0x42

    Bits 31-24    Bits 23-16          Bits 15-8          Bits 7-0
    01000010      Blue/U Mask Data    Red/V Mask Data    Green/Y Mask Data


Each of the three 8-bit registers contains the bit pattern
used when the corresponding bit in the Mask
Set register is asserted.


Mask Set Registers 0x41

    Bits 31-24    Bits 23-16      Bits 15-8      Bits 7-0
    01000001      Blue/U Color    Red/V Color    Green/Y Color


This is a 24-bit register that contains the mask bit
pattern for the RGB/YUV color map addresses.
When a bit in this register is asserted, the corresponding
bit in the address is set to the value defined
in the Mask Data registers.
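
The interaction of the Mask Set and Mask Data registers can be sketched per channel as a bitwise select. This is a Python illustration with a hypothetical function name; it treats one 8-bit channel (Blue/U, Red/V, or Green/Y) at a time.

```python
def masked_clut_address(address, mask_set, mask_data):
    """Apply one channel's mask to an 8-bit color map address.

    Where a mask_set bit is 1, the address bit is replaced by the
    corresponding mask_data bit; elsewhere the address bit passes through.
    """
    for value in (address, mask_set, mask_data):
        if not 0 <= value <= 0xFF:
            raise ValueError("all arguments must be 8-bit values")
    return (address & ~mask_set & 0xFF) | (mask_data & mask_set)
```

For example, with `mask_set = 0b00001111` and `mask_data = 0b00000101`, the address 0b10101010 becomes 0b10100101: the upper nibble passes through and the lower nibble comes from the mask data.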


CLUT Index Register 0x40

    Bits 31-24    Bits 23-16    Bits 15-8    Bits 7-0
    01000000      Not Used      Not Used     YUV CLUT Index


The CLUT Index register is an 8-bit register used for
loading the color tables. This register maps the
user-specified 6-bit color map address into an 8-bit
address. A logical OR operation is performed between
the 6-bit address and the 8-bit index word to obtain
the new CLUT address.


Color Lookup Table Addresses 0x00-0x3f 

If the 82750DB Enable mode bit in the Miscellane- 
ous Control register is set to zero, the CLUT ad- 
dresses are decoded to appear as addresses to the 
reduced-size 82750DA color table. The least signifi- 
cant four bits of the address are used for the Y color 
table address, and the upper nibble is used to ad- 
dress the V and U color table simultaneously. This is 
a compatibility mode for the 82750DA, which has a 
reduced-size color table. 


    Bits 31-28    Bits 27-24    Bits 23-16    Bits 15-8    Bits 7-0
    UV Address    Y Address     U Data        V Data       Y Data


If the 82750DB Enable mode bit is set to one, the full 
color table is used. In this case, the most significant 
byte of the 32-bit data word is used as an address to 
the color table. The address is ORed with the most 
recently loaded CLUT Index register. 


    Bits 31-30    Bits 29-24     Bits 23-16    Bits 15-8    Bits 7-0
    0 0           YUV Address    U Data        V Data       Y Data
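
The full-size (82750DB mode) color table load can be sketched as follows: one function builds the 32-bit load word from the layout above, and another shows how the 6-bit address combines with the CLUT Index register. Both function names are hypothetical, and this is an illustration of the described bit layout rather than a tested driver.

```python
def clut_load_word(address6, u, v, y):
    """Build a color table load word: 6-bit address in bits 29-24,
    then U, V, and Y data bytes; bits 31-30 stay zero."""
    if not 0 <= address6 <= 0x3F:
        raise ValueError("color map address must fit in 6 bits")
    for value in (u, v, y):
        if not 0 <= value <= 0xFF:
            raise ValueError("U, V, and Y data must each fit in 8 bits")
    return (address6 << 24) | (u << 16) | (v << 8) | y


def effective_clut_address(address6, clut_index):
    """OR the 6-bit address with the most recently loaded 8-bit CLUT Index."""
    if not 0 <= address6 <= 0x3F or not 0 <= clut_index <= 0xFF:
        raise ValueError("address is 6 bits, index is 8 bits")
    return address6 | clut_index
```

Loading the CLUT Index with 0x00, 0x40, 0x80, and 0xC0 in turn would let the 64 register addresses reach all 256 table entries under this reading.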
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82750DB Register Summary 

The following table illustrates the register space of the 82750DB. 

Table 4-6. 82750DB Register Space 


Address        82750DB Register

0x00-0x0f      CLUT Locations 0-15
0x10-0x30      CLUT Locations 16-48
0x31           CLUT Location 49
0x32           CLUT Location 50
0x33           CLUT Location 51
0x34           CLUT Location 52
0x35-0x37      CLUT Location 53-55
0x38           CLUT Location 56
0x39-0x3f      CLUT Location 57-63
0x40           CLUT Index Register
0x41           CLUT Mask Set Register
0x42           CLUT Mask Data Register
0x43           Miscellaneous Control
0x44           General Control
0x45           Pixel Control
0x46           Blanking Color
0x47           Alpha Register
0x48           Border Color
0x49           Register Transfer
0x4a           Line Notification and Timing
0x4b           DFL Load
0x4c           Burst Gate Start
0x4d           Burst Gate Stop
0x4e           Active Region Start
0x4f           Active Region Stop
0x50           Equalization Parameters
0x51           Serration Start
0x52           Blanking Start
0x53           Blanking Stop
0x54           Sync Start
0x55           Sync Stop
0x56           Base Counters
0x57           Cursor Color 1
0x58           Cursor Color 2
0x59           Cursor Color 3
0x5a           Cursor Control
0x5b           Not Used
0x5c           Not Used
0x5d           Not Used
0x5e           Not Used
0x5f           Not Used
0x60           Not Used
0x61           Not Used
0x62           Not Used
0x63           Not Used
0x64           Not Used
0x65           Not Used
0x66           Not Used
0x67           Not Used
0x68           Not Used
0x69-0x6e      Not Used
0x6f           Not Used
0x70           Not Used
0x71-0x7f      Not Used
0x80-0xfe      Not Used
0xff           Stop Code
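
As an illustration of how the register space in Table 4-6 is organized, the sketch below maps an 8-bit register address to its table entry. The function and list names are hypothetical helpers for this example; only the address assignments come from the table.

```python
# Register names for addresses 0x40 through 0x5a, in table order.
NAMES_FROM_0X40 = [
    "CLUT Index Register", "CLUT Mask Set Register", "CLUT Mask Data Register",
    "Miscellaneous Control", "General Control", "Pixel Control",
    "Blanking Color", "Alpha Register", "Border Color", "Register Transfer",
    "Line Notification and Timing", "DFL Load", "Burst Gate Start",
    "Burst Gate Stop", "Active Region Start", "Active Region Stop",
    "Equalization Parameters", "Serration Start", "Blanking Start",
    "Blanking Stop", "Sync Start", "Sync Stop", "Base Counters",
    "Cursor Color 1", "Cursor Color 2", "Cursor Color 3", "Cursor Control",
]


def register_name(addr):
    """Return the Table 4-6 entry for an 8-bit register address."""
    if not 0 <= addr <= 0xFF:
        raise ValueError("register address must be in the range 0x00-0xff")
    if addr <= 0x3F:
        return f"CLUT Location {addr}"       # 64 color table entries
    if addr <= 0x5A:
        return NAMES_FROM_0X40[addr - 0x40]  # control and timing registers
    if addr == 0xFF:
        return "Stop Code"
    return "Not Used"                        # 0x5b through 0xfe
```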
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5.0 ELECTRICAL DATA 
Maximum Ratings 

Table 5-1 is a stress rating only, and functional operation 
at the maximums is not guaranteed. Functional operat- 
ing conditions are given in the DC and AC Characteris- 
tics (Tables 5-2, 5-3, 5-4, and 5-5). 


Exposure to the Maximum Ratings may affect device 
reliability. Furthermore, although the 82750DB con- 
tains protective circuitry to resist damage from static 
electrical discharge, always take precautions to 
avoid high static voltages or electric fields. 


Table 5-1. Absolute Maximum Requirements 


Condition                                    Maximum Requirement

Case Temperature under Bias                  -65°C to 110°C
Storage Temperature                          -65°C to 110°C
Voltage on Any Pin with Respect to Ground    -0.5V to Vcc + 0.5V
Supply Voltage with Respect to Vss           -0.5V to +6.5V


DC Characteristics 

Table 5-2. DC Characteristics Vcc = 5V ±10%, Tcase = 0°C to 95°C


Symbol    Parameter                  Min    Typ    Max           Unit   Notes

VIL       Input LOW Voltage          -0.3          [illegible]   V
VIH       Input HIGH Voltage         2.0           [illegible]   V
VOL       Output LOW Voltage                       0.4           V
VOH       Output HIGH Voltage        2.4                         V
ILI       Input Leakage Current      -10           +10           µA
IOZ       Output Leakage Current                   +10           µA
ICCT      Power Supply Current              185    250           mA     28 MHz (2)
ICCNT     Power Supply Current                     190           mA     28 MHz (3)
ICCT      Power Supply Current                     375           mA     45 MHz (2)
ICCNT     Power Supply Current                     [illegible]   mA     45 MHz (3)
CIN       Input Capacitance                        10.0          pF     Fc = 1 MHz (4)
COUT      Output Capacitance                       12.0          pF     Fc = 1 MHz (4)
CFREQIN   FREQIN Input Capacitance                 20.0          pF     Fc = 1 MHz (4)


NOTES: 

1 . Measured with FREQIN = 7 MHz. 

2. Typical current value measured under typical conditions with the Digital Outputs (DGY, DRV, and DBU) toggling. Maximum 
current value guaranteed with 50 pF maximum output loading. Analog Outputs disabled. 

3. Typical current value measured under typical conditions with the Digital Outputs (DGY, DRV, and DBU) not toggling.
Maximum current value guaranteed with 50 pF maximum output loading. Analog Supply Current IACC not included.

4. Not 100% tested. 
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AC Characteristics 


Table 5-3. AC Characteristics at 28 MHz Vcc = 5V ±10%, Tcase = 0°C to 95°C, CL = 50 pF


Symbol  Parameter                          Min          Max                   Unit  Figure         Notes

        Frequency                          7            28                    MHz                  1X Clock
t1      FREQIN Period                      35           140                   ns    5-1
t2      FREQIN High Time                   12           23                    ns    5-1            (Note 1)
t3      FREQIN Low Time                    12           23                    ns    5-1            (Note 1)
t4      FREQIN Fall Time                                4                     ns    5-1
t5      FREQIN Rise Time                                4                     ns    5-1
t6a     HSYNC, VSYNC, CSYNC, BG,                        24                    ns    5-2
        FCO Valid Delay
t6b     VBUS[3:0] Valid Delay                           26                    ns    5-2
t7      RESETB#, VRESET#, HRESET#,         0                                  ns    5-3
        DISDIG, TESTACT Setup
t8      RESETB#, VRESET#, HRESET#,         13                                 ns    5-3
        DISDIG, TESTACT Hold
t9      SCLK[1:0] Valid Delay High                      14                    ns    5-4            1X Mode
t10     SCLK[1:0] Valid Delay Low                       1/2 t1 + [illegible]  ns    5-4            1X Mode
t11     SCLK[1:0] Valid Delay                           [illegible]           ns    5-5, 5-6       1/2X, 1/3X Mode
t12     DATAIN[31:0] Setup                 [illegible]                        ns    5-4, 5-5, 5-6
t13     DATAIN[31:0] Hold                  5                                  ns    5-4, 5-5, 5-6
t14     PIXCLK Valid Delay                              [illegible]           ns    5-7            (Note 2)
t15     PIXCLK Valid Delay                              [illegible]           ns    5-7            (Note 3)
t16     DRV[7:0], DGY[7:0], DBU[7:0],      [illegible]                        ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1] Output Setup
t17     DRV[7:0], DGY[7:0], DBU[7:0],      15                                 ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1] Output Hold
t18     VBUS[3:0], SCLK[1:0], FCO,                      30                    ns    5-9            (Note 4)
        HSYNC, VSYNC, CSYNC, CB, BG,
        PIXCLK, DRV[7:0], DGY[7:0],
        DBU[7:0], ALPHA[7:0], ACTDIS,
        BPP[0], BPP[1] Float Delay
t19     DISDIG, DRV[7:0], DGY[7:0],        3 t1                               ns    5-10
        DBU[7:0], Digital Output
        Disable Delay
t20     DISDIG, DRV[7:0], DGY[7:0],        3 t1                               ns    5-10
        DBU[7:0], Digital Output
        Enable Delay
t21     DISDAC, RV, GY, BU Analog                       19                    ns    5-11           (Note 6)
        Output Disable Delay
t22     DISDAC, RV, GY, BU Analog                       19                    ns    5-11           (Note 6)
        Output Enable Delay
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NOTES: 

1. This assumes a 35 ns period. For other speeds, the FREQIN High and Low Times should fall within a 40% to 60% duty
cycle.

2. For integer pixel times, t14 is the Valid Delay on all assertions of PIXCLK during active display time.

3. For non-integer pixel times, t15 is the Valid Delay on alternating assertions of PIXCLK during active display time.

4. Not 100% tested.

5. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load.

6. Analog output delay is measured at the 50% level of the full scale transition with RL = 75Ω and CL = 25 pF.


AC Characteristics 


Table 5-4. AC Characteristics at 45 MHz Vcc = 5V ±10%, Tcase = 0°C to 95°C, CL = 50 pF


Symbol  Parameter                          Min          Max                   Unit  Figure         Notes

        Frequency                          7            45                    MHz                  1X Clock
t1      FREQIN Period                      22           140                   ns    5-1
t2      FREQIN High Time                   7            15                    ns    5-1            (Note 1)
t3      FREQIN Low Time                    7            15                    ns    5-1            (Note 1)
t4      FREQIN Fall Time                                4                     ns    5-1
t5      FREQIN Rise Time                                4                     ns    5-1
t6a     HSYNC, VSYNC, CSYNC, BG,                        20                    ns    5-2
        FCO Valid Delay
t6b     VBUS[3:0] Valid Delay                           [illegible]           ns    5-2
t7      RESETB#, VRESET#, HRESET#,         [illegible]                        ns    5-3
        DISDIG, TESTACT Setup
t8      RESETB#, VRESET#, HRESET#,         [illegible]                        ns    5-3
        DISDIG, TESTACT Hold
t9      SCLK[1:0] Valid Delay High                      [illegible]           ns    5-4            1X Mode
t10     SCLK[1:0] Valid Delay Low                       1/2 t1 + 12           ns    5-4            1X Mode
t11     SCLK[1:0] Valid Delay                           12                    ns    5-5, 5-6       1/2X, 1/3X Mode
t12     DATAIN[31:0] Setup                 [illegible]                        ns    5-4, 5-5, 5-6
t13     DATAIN[31:0] Hold                  3                                  ns    5-4, 5-5, 5-6
t14     PIXCLK Valid Delay                              1/2 t1 + 20           ns    5-7            (Note 2)
t15     PIXCLK Valid Delay                              20                    ns    5-7            (Note 3)
t16     DRV[7:0], DGY[7:0], DBU[7:0],      0                                  ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1]A/UGR Output Setup
t17     DRV[7:0], DGY[7:0], DBU[7:0],      10                                 ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1]A/UGR Output Hold
t18     VBUS[3:0], SCLK[1:0], FCO,                      30                    ns    5-9            (Note 4)
        HSYNC, VSYNC, DRV[7:0],
        DGY[7:0], ALPHA[7:0], ACTDIS,
        BPP[0], BPP[1]A/UGR Float Delay
t19     DISDIG, DRV[7:0], DGY[7:0],        3 t1                               ns    5-10
        DBU[7:0], Digital Output
        Disable Delay
t20     DISDIG, DRV[7:0], DGY[7:0],        3 t1                               ns    5-10
        DBU[7:0], Digital Output
        Enable Delay
t21     DISDAC, RV, GY, BU Analog                       19                    ns    5-11           (Note 6)
        Output Disable Delay
t22     DISDAC, RV, GY, BU Analog                       19                    ns    5-11           (Note 6)
        Output Enable Delay


NOTES: 

1. This assumes a 22 ns period. For other speeds, the FREQIN High and Low Times should fall within a 40% to 60% duty
cycle.

2. For integer pixel times, t14 is the Valid Delay on all assertions of PIXCLK during active display time.

3. For non-integer pixel times, t15 is the Valid Delay on alternating assertions of PIXCLK during active display time.

4. Not 100% tested.

5. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load.

6. Analog output delay is measured at the 50% level of the full scale transition with RL = 75Ω and CL = 25 pF.
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Figure 5-7. PIXCLK Waveforms 


240855-26 



Figure 5-8. Output Setup and Hold 


240855-27 



Figure 5-9. TESTACT# Float Delay
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INDICATES HIGH-IMPEDANCE STATE 


Figure 5-11. DISDAC to Analog Output Delay


Digital to Analog Converter Electrical Characteristics 


Table 5-5. DAC D.C. Characteristics AVcc = 5V ±10%; Tcase = 0°C to +95°C


Symbol  Parameter                     Min                        Typ    Max                        Unit  Notes

Iref    Reference Current                                               1500                       µA
Ifs     Output Current                0.93 * (255/18.5) * Iref          1.07 * (255/18.5) * Iref   mA    (Note 1)
        (Full Scale)
Vfs     Output Voltage                                           1.0    1.5                        V
        (Full Scale)
INL     Integral Nonlinearity                                    1.0    ±3                         LSB
DNL     Differential Nonlinearity                                       ±1                         LSB
IACC    Analog Supply Current                                           3 * Ifs + 8                mA    (Note 2)
DDTR    DAC to DAC Tracking                                      2.0    5.0                        %     (Note 3)
        at Full Scale
Cout    Output Capacitance                                              12                         pF    (Note 4)


NOTES: 

1. Maximum Ifs allowed = 22 mA.

2. Maximum IACC allowed = 74 mA. Typical value of IACC = 3 * Ifs + 6

3. Maximum deviation between RV, GY and BU outputs at fullscale output voltage.

4. Not 100% tested.

5. All DAC testing done with Iref = 1500 µA.
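
The full-scale current limits and the analog supply current in Table 5-5 are simple functions of Iref. The sketch below evaluates them for the 1500 µA reference used in DAC testing. The function names are illustrative, and the unit handling (Iref in µA, results in mA) is an assumption of this example.

```python
def full_scale_current_range_ma(iref_ua):
    """Return (min, max) full-scale output current Ifs in mA for Iref in µA."""
    nominal_ma = (255 / 18.5) * iref_ua / 1000.0  # µA to mA
    return 0.93 * nominal_ma, 1.07 * nominal_ma


def max_analog_supply_current_ma(ifs_ma):
    """Maximum analog supply current: IACC = 3 * Ifs + 8 (mA)."""
    return 3 * ifs_ma + 8


def typical_analog_supply_current_ma(ifs_ma):
    """Typical analog supply current from Note 2: IACC = 3 * Ifs + 6 (mA)."""
    return 3 * ifs_ma + 6


ifs_min, ifs_max = full_scale_current_range_ma(1500)
print(f"Ifs range: {ifs_min:.2f} mA to {ifs_max:.2f} mA")
print(f"Max IACC at Ifs = 22 mA: {max_analog_supply_current_ma(22)} mA")
```

With Iref = 1500 µA the upper limit comes out at roughly 22.1 mA, in line with the 22 mA ceiling of Note 1, and Ifs = 22 mA gives the 74 mA maximum IACC of Note 2.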
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Table 5-6. DAC A.C. Characteristics 


Symbol 


Parameter 

Rise/Fali Time 
Clock Feedthrough 


Output Skew 
Crosstalk 


pV-sec 


pV-sec 


(Note 1) 
(Note 2) 


(Notes 2, 3) 


(Note 2) 


NOTES: 

1. Maximum value is for RL = 75Ω and CL = 25 pF. Defined as 10% to 90% fullscale transition.

2. Assumes an 80 MHz filter on output.

3. Glitch energy generated from the influence that 2 active outputs have on an idle output.

4. DISDIG must be tied high.

5. Assumes the use of 0.1 µF capacitor between VGCS and AVcc and 0.1 µF and 10 µF capacitors between IREFIN and





RL = Load Resistance
CL = Load Capacitance
C1 = 0.1 µF
C2 = 10 µF

0 < Iout < Ifs

0 < Vout < Vfs

Tr = Tf ≈ 3 * RL * (CL + Cout)


Figure 5-12. Typical Output Configuration 
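
The rise/fall time relation shown with Figure 5-12 can be evaluated directly. This is a rough sketch that reads the relation as Tr = Tf ≈ 3 * RL * (CL + Cout); the function name and the choice of units (ohms, picofarads, nanoseconds) are assumptions made for the example.

```python
def dac_rise_time_ns(rl_ohms, cl_pf, cout_pf):
    """Approximate analog output rise/fall time in ns.

    One ohm times one picofarad is 1e-12 s, i.e. 1e-3 ns.
    """
    return 3 * rl_ohms * (cl_pf + cout_pf) * 1e-3


# RL = 75 ohms and CL = 25 pF (the DAC test load), Cout = 12 pF (Table 5-5 maximum)
print(f"{dac_rise_time_ns(75, 25, 12):.3f} ns")
```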
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Output Delay and Rise Time versus Load Capacitance 



Figure 5-13. Typical Output Valid Delay versus Load Capacitance under Worst Case Conditions





[Graph: Rise Time (ns), 0.8V to 2.0V, on a 1 ns to 6 ns axis, versus CL (picofarads) from 25 to 150]    240855-31

NOTE:

This graph will not be linear outside of the CL range shown.


Figure 5-14. Typical Output Rise Time versus Load Capacitance under Worst Case Conditions 
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6.0 MECHANICAL DATA 


Packaging Outlines and Dimensions 

Intel packages the 82750DB in a Plastic Quad Flat 
Pack (PQFP). Table 6-1 gives the symbol list for the 
PQFP. 


Table 6-1. PQFP Symbol List 


Letter or
Symbol       Description of Dimensions

A            Package Height: Distance from Seating Plane to Highest Point of Body
A1           Standoff: Distance from Seating Plane to Base Plane
D/E          Overall Package Dimension: Lead Tip to Lead Tip
D1/E1        Plastic Body Dimension
D2/E2        Bumper Distance
D3/E3        Footprint
D4/E4        Foot Radius Location
L1           Foot Length
N            Total Number of Leads


The PQFP has the following specifications: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982.

2. Datum plane -H- is located at the mold parting line and coincident with the bottom of the lead where lead exits plastic body.

3. Datums A-B and -D- are to be determined where center leads exit plastic body at datum plane -H-.

4. Controlling dimension is the inch.

5. Dimensions D1, D2, E1, and E2 are measured at the mold parting line and do not include mold protrusion. Allowable mold protrusion is 0.18 mm (0.007 in.) per side.

6. Pin 1 identifier is located within one of the two zones indicated.

7. Measured at datum plane -H-.

8. Measured at seating plane datum -C-.

Table 6-2 provides outline characteristics for 0.025-in. pitch.


Table 6-2. Intel Case Outline Drawings 
for PQFP at 0.025 Inch Pitch 


Symbol    Description            Min         Max

N         Leadcount              132         132
A         Package Height         0.160       0.180
A1        Standoff               0.020       0.040
D, E      Terminal Dimension     1.070       1.090
D1, E1    Package Body           0.947       0.953
D2, E2    Bumper Distance        1.097       1.103
D3, E3    Lead Dimension         0.800 REF   0.800 REF
D4, E4    Foot Radius Location   1.023       1.037








mm (inch) 240855-32 

Figure 6-1. Principal Dimensions of the 82750DB in the 132-Lead PQFP Package 
Figure 6-2. 132-Lead PQFP Mechanical Package Detail: Typical Lead


Figure 6-3. 132-Lead PQFP Mechanical Package Detail: Protective Bumper


Figure 6-4. Detailed Dimensions of the 82750DB in the 132-Lead PQFP Package: Molded Details


Figure 6-5. Detailed Dimensions of the 82750DB in the 132-Lead PQFP Package: Terminal Details
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Package Thermal Specifications 

The 82750DB is specified for operation when Tc (the case temperature) is within the range of 0°C to
95°C. Tc may be measured in any environment to determine whether the 82750DB is within specified
operating range. The case temperature should be measured at the center of the top surface.

Ta (the ambient temperature) can be calculated from θCA (thermal resistance from case to ambient)
with the following equation:


Ta = Tc - P * θCA

Typical values for θCA at various airflows are given in Table 6-3 for the 132-lead PQFP package. When
using the digital outputs, Table 6-4 shows the maximum Ta allowable (without exceeding Tc) at various
airflows. The power dissipation (P) is calculated by using the typical supply currents at 5V as shown in
Table 5-2.

Similarly, when using the analog outputs, the maximum Ta allowed is a function of Ifs. The power is
given by the following equation, which can then be used in calculating the maximum Ta.

P = 5V * (Iccnt + (3 * Ifs + 6))


Table 6-3. Thermal Resistances (°C/W)


                 θCA Versus Airflow, ft/min (m/sec)

Package          0      200      400      600      800      1000
                 (0)    (1.01)   (2.02)   (3.04)   (4.06)   (5.07)

132-Lead PQFP    26.0   17.5     14.0     11.5     9.5      8.5


Table 6-4. Maximum Ta at Various Airflows (°C) 



                              Ta Versus Airflow, ft/min (m/sec)

Package          Frequency    0      200      400      600      800      1000
                 (MHz)        (0)    (1.01)   (2.03)   (3.04)   (4.06)   (5.07)

132-Lead PQFP    28           71     79       82       84       86       87
                 45           59     71       75       79       82       83
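
The relationship between Tables 6-3 and 6-4 can be checked numerically. The sketch below applies Ta = Tc - P * θCA with the θCA values from Table 6-3; it assumes a typical 28 MHz supply current of 185 mA with the digital outputs toggling (the reading taken here from Table 5-2), and the function and variable names are illustrative only.

```python
# Typical case-to-ambient thermal resistance (°C/W), keyed by airflow in ft/min.
THETA_CA = {0: 26.0, 200: 17.5, 400: 14.0, 600: 11.5, 800: 9.5, 1000: 8.5}

TC_MAX = 95.0  # maximum case temperature in °C


def max_ambient_temp(power_w, airflow_ft_min):
    """Maximum ambient temperature: Ta = Tc - P * thetaCA."""
    return TC_MAX - power_w * THETA_CA[airflow_ft_min]


def analog_power_w(iccnt_ma, ifs_ma):
    """Power with analog outputs in use: P = 5V * (Iccnt + (3 * Ifs + 6)), currents in mA."""
    return 5.0 * (iccnt_ma + (3 * ifs_ma + 6)) / 1000.0


# Digital outputs at 28 MHz: P = 5 V * 185 mA (assumed typical value).
power_28mhz = 5.0 * 0.185
for airflow in sorted(THETA_CA):
    print(airflow, round(max_ambient_temp(power_28mhz, airflow)))
```

Under that assumption the rounded results match the 28 MHz row of Table 6-4.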
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82750PB 

PIXEL PROCESSOR 


■ 25 MHz Clock with Single Cycle 
Execution 

■ Zero Branch Delay 

■ Wide Instruction Word Processor 

■ 512 X 48-Bit Instruction RAM 

■ 512 X 16-Bit Data RAM 

■ Two Internal 16-Bit Buses 

■ ALU with Dual-Add-With-Saturation 
Mode 

■ Variable Length Sequence Decoder 


■ Pixel Interpolator 

■ High Performance Memory Interface 

— 32-Bit Memory Data Bus 

— 50 MBytes per Second Maximum 

— 25 MBytes per Second with Standard 
VRAMs or DRAMs 

■ 16 General-Purpose Registers 

■ 4 Gbyte Linear Address Space 

■ 132-Pin PQFP 

■ Compatible with the 82750PA 


Intel’s 82750PB is a 25 MHz wide instruction processor that generates and manipulates pixels. When paired 
with its companion chip, the 82750DB, and used to implement a DVI Technology video subsystem, the 
82750PB provides real time (30 images/sec) pixel processing, real time video compression, interactive motion 
video playback and real time video effects. 


Real time pixel manipulations, including 30 images/sec video compression, are supported by the 25 MHz
instruction rate. On-chip instruction RAM provides programmability for execution of a wide range of algorithms
that support motion video decompression, text, and 2D and 3D graphics. Inner loops are optimized with the
integration of sixteen 16-bit quad ported registers, on-chip DRAM, and two loop counters that provide zero
delay two-way branching “free” in any instruction. Two 16-bit internal buses enable two parallel register
transfers on each 82750PB instruction, contributing to the real time performance of the video processing.
Another feature that adds to the processing power of the 82750PB is the 16-bit ALU, which includes an 8-bit
dual-add-with-saturate operation critical for pixel arithmetic. Other specialized features for pixel processing
include a 2D pixel interpolator for image processing functions and a variable length sequence decoder for
decoding compressed data.

The 82750PB is implemented using Intel’s low-power CHMOS IV Technology and is packaged in a 132-lead
space-saving, plastic quad flat pack (PQFP) package.


82750PB Subsystem Diagram


240854-1


Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in an Intel product. No other circuit patent 
licenses are implied. Information contained herein supersedes previously published specifications on these devices from Intel. February 1991 
©INTEL CORPORATION, 1991 ct Order Number: 240854-003 
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82750PB Pixel Processor 
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Table 1-1. Pin Cross Reference by Pin Name 


Pin Name  Location    Pin Name  Location    Pin Name  Location    Pin Name  Location

A2        71          BE3#      41          D30       119         Vcc       100
A3        72          CLKIN     47          D31       118         Vcc       104
A4        73          CLKOUT    114         HALEN#    55          Vss       94
A5        74          D0        28          HALT#     31          Vcc       109
A6        77          D1        27          HBUSEN#   36          Vcc       116
A7        78          D2        26          HINT#     30          Vcc       123
A8        79          D3        24          HRAM#     58          Vcc       127
A9        80          D4        23          HRDY#     38          Vcc       132
A10       81          D5        22          HREG#     40          Vss       1
A11       83          D6        20          HREQ#     56          Vss       32
A12       84          D7        19          MRDY#     60          Vss       34
A13       85          D8        18          MREQ#     59          Vss       39
A14       86          D9        16          NXTFST#   61          Vss       48
A15       87          D10       15          PMFRZ#    70          Vss       57
A16       88          D11       14          RESET#    63          Vss       66
A17       90          D12       13          RFSH#     62          Vss       68
A18       92          D13       12          TEST#     69          Vss       76
A19       93          D14       11          TRNFR#    37          Vss       89
A20       95          D15       9           VBUS[0]   54          Vss       99
A21       96          D16       8           VBUS[1]   53          Vss       101
A22       97          D17       7           VBUS[2]   52          Vss       108
A23       102         D18       6           VBUS[3]   50          Vss       115
A24       103         D19       5           Vcc       2           Vss       117
A25       105         D20       4           Vcc       33          Vss       124
A26       106         D21       3           Vcc       35          Vss       131
A27       107         D22       130         Vcc       45          Vss       10
A28       110         D23       129         Vcc       51          Vss       17
A29       111         D24       128         Vcc       65          Vss       21
A30       112         D25       126         Vcc       67          Vss       25
A31       113         D26       125         Vcc       75          Vss       29
BE0#      44          D27       122         Vcc       82          Vss       46
BE1#      43          D28       121         Vcc       91          Vss       64
BE2#      42          D29       120         Vcc       98          WE#       49
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Table 1-2. Pin Cross Reference by Location 


Location  Pin Name    Location  Pin Name    Location  Pin Name    Location  Pin Name

1         Vss         34        Vss         67        Vcc         100       Vcc
2         Vcc         35        Vcc         68        Vss         101       Vss
3         D21         36        HBUSEN#     69        TEST#       102       A23
4         D20         37        TRNFR#      70        PMFRZ#      103       A24
5         D19         38        HRDY#       71        A2          104       Vcc
6         D18         39        Vss         72        A3          105       A25
7         D17         40        HREG#       73        A4          106       A26
8         D16         41        BE3#        74        A5          107       A27
9         D15         42        BE2#        75        Vcc         108       Vss
10        Vss         43        BE1#        76        Vss         109       Vcc
11        D14         44        BE0#        77        A6          110       A28
12        D13         45        Vcc         78        A7          111       A29
13        D12         46        Vss         79        A8          112       A30
14        D11         47        CLKIN       80        A9          113       A31
15        D10         48        Vss         81        A10         114       CLKOUT
16        D9          49        WE#         82        Vcc         115       Vss
17        Vss         50        VBUS[3]     83        A11         116       Vcc
18        D8          51        Vcc         84        A12         117       Vss
19        D7          52        VBUS[2]     85        A13         118       D31
20        D6          53        VBUS[1]     86        A14         119       D30
21        Vss         54        VBUS[0]     87        A15         120       D29
22        D5          55        HALEN#      88        A16         121       D28
23        D4          56        HREQ#       89        Vss         122       D27
24        D3          57        Vss         90        A17         123       Vcc
25        Vss         58        HRAM#       91        Vcc         124       Vss
26        D2          59        MREQ#       92        A18         125       D26
27        D1          60        MRDY#       93        A19         126       D25
28        D0          61        NXTFST#     94        Vss         127       Vcc
29        Vss         62        RFSH#       95        A20         128       D24
30        HINT#       63        RESET#      96        A21         129       D23
31        HALT#       64        Vss         97        A22         130       D22
32        Vss         65        Vcc         98        Vcc         131       Vss
33        Vcc         66        Vss         99        Vss         132       Vcc
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Quick Pin Reference 

Table 1-3 provides descriptions of 82750PB pins. 


Table 1-3. Pin Descriptions 


Symbol     Type  Name and Function

CLKIN      I     CLKIN is a 1X CLOCK INPUT that provides the fundamental timing for the
                 82750PB. One cycle of CLKIN is denoted as one T-cycle.

RESET#     I     The 82750PB is reset and initialized by holding this signal active for at
                 least ten T-cycles. Refer to Initializing the 82750PB Section in Chapter 3.

HREQ#      I     The HOST REQUEST signal is a request from the host CPU to perform a read
                 or write access to either registers on the 82750PB, an external device, or
                 to VRAM shared by the 82750PB and the host. The type of access that is
                 requested is determined by the host access definition signals: HREG#,
                 HRAM#, and WE#.

HREG#,     I     The HOST REGISTER and HOST RAM signals, when validated by HREQ#, are
HRAM#            used to define three host access cycles. HRAM# active indicates the host
                 is requesting a VRAM read or write cycle. HREG# active indicates that the
                 host is requesting an 82750PB register read or register write cycle. When
                 both signals are inactive, a host external cycle is requested.

HBUSEN#    O     HOST BUS ENABLE is asserted by the 82750PB at the start of a host access
                 to indicate that the 82750PB Address and Data buses (A[31:2], BE#[3:0],
                 and D[31:0]) have been tri-stated. This allows the host to drive the same
                 buses either for accessing shared VRAM or the 82750PB internal registers.

HALEN#     I     The HOST ADDRESS LATCH ENABLE signal is used to indicate to the 82750PB
                 that the host has asserted a valid address (A[31:2], BE#[3:0]) and write
                 enable (WE#).

HRDY#      O     HOST READY is asserted by the 82750PB at the end of a host access to
                 indicate that the access cycle is ready for data transfer. For a host write
                 cycle, HRDY# indicates that the 82750PB is ready to accept data from the
                 host. For a host VRAM write cycle, HRDY# indicates that the VRAM has
                 latched the data from the host. For a host read cycle, HRDY# indicates
                 that output data from the 82750PB or VRAM is ready to be latched by the
                 host.

HINT#      O     HOST INTERRUPT: This output is asserted when an interrupt condition is
                 detected by the 82750PB, and the enable bit in the PROCESSOR CONTROL
                 register corresponding to that interrupt condition is set to a ONE. HINT#
                 stays active until the host CPU reads the INTERRUPT STATUS register. If
                 an interrupt condition that is enabled occurs during the same cycle that
                 the INTERRUPT STATUS register is being read, HINT# remains active.

D[31:0]    I/O   The DATA BUS is used to transfer data between:
                 1. The 82750PB and VRAM, and
                 2. The Host CPU and internal 82750PB registers.
                 During host VRAM accesses, this bus is tri-stated to allow the host to
                 share the same VRAM data bus. During host accesses to internal 82750PB
                 registers all 32 bits are used for data transfer.

A[31:9]    O     The ADDRESS BUS is shared between the 82750PB and the host for
A[8:2]     I/O   addressing VRAM. This 30-pin bus addresses 32-bit double words in VRAM.
                 Byte Enable signals are used to address individual bytes or words within a
                 double word in VRAM. In addition, the address for host accesses to internal
                 82750PB registers are communicated to the 82750PB using the lower seven
                 pins, A[8:2], and the BE# pins. During host access cycles to either VRAM
                 or 82750PB internal registers, A[31:2] are tri-stated. For internal register
                 accesses, as indicated by HREG# being low, the lower seven bits, A[8:2],
                 are used as the host address input.

CLKOUT     O     The CLOCK OUTPUT signal is one of the two internal clocks and is
                 synchronized with CLKIN. It is always driven and will have a 50% duty
                 cycle.
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Table 1-3. Pin Descriptions (Continued) 


Symbol     Type  Name and Function

BE#[3:0]   I/O   The BYTE ENABLE BUS is shared by the 82750PB and the host for addressing
                 VRAM down to the byte level. The correspondence between the four Byte
                 Enable pins and the D[31:0] pins is: BE#[3] - D[31:24], BE#[2] - D[23:16],
                 BE#[1] - D[15:8], and BE#[0] - D[7:0]. During VRAM read cycles, the
                 82750PB enables all four bytes. During write cycles the 82750PB only
                 enables those bytes that are to be written. Bytes that are not enabled are
                 not to be altered in VRAM. During host accesses to 82750PB on-chip
                 registers, the BE#[0] pin is used as an input to select whether the even or
                 odd word is being accessed; the double word address is provided by the host
                 on the A[8:2] pins. BE#[0] = 0 indicates that data is transferred on
                 D[15:0]. BE#[0] = 1 indicates that data is transferred on D[31:16].

MREQ#      O     MEMORY REQUEST is asserted for the first cycle, T1, of each VRAM cycle.

TRNFR#,    O     The MEMORY CYCLE DEFINITION SIGNALS: Transfer, Refresh and Write
RFSH#            Enable are asserted at the same time as MREQ#, but stay active for the
                 entire VRAM cycle. TRNFR# active indicates a VRAM transfer cycle.
                 RFSH# active indicates a VRAM refresh cycle. If neither TRNFR# nor
                 RFSH# are active, a VRAM data read or write cycle is requested.

WE#        I/O   The WRITE ENABLE pin is used as an output during a 82750PB/VRAM cycle
                 to drive the WE# signal, which defines the access as a VRAM read cycle
                 (when inactive) or write cycle (when active). During Host/VRAM and Host
                 External cycles, the 82750PB tri-states this pin to allow the host to drive
                 the VRAM write enable signals directly. During Host/register cycles, this
                 pin is used as an input for the Host Write Enable signal to determine
                 whether the host is reading or writing the 82750PB register.

NXTFST#    O     The NEXT FAST signal indicates that the following VRAM cycle can be
                 performed with a page-mode or bank-interleaved access. This signal is
                 asserted during the first of a pair of VRAM cycles that is guaranteed to be
                 within the same VRAM page and in opposite banks: a pair of accesses to
                 two sequential double words in VRAM at addresses Even Address and
                 Even Address + 1. In other words, A[2] is a zero for the first cycle and a
                 one for the second cycle.

MRDY#      I     The MEMORY READY input indicates that the VRAM cycle has progressed
                 to the point where it is ready to perform the data transfer. For a VRAM
                 read cycle, the VRAM data can be latched by the transition of MRDY# to
                 an active state. For a VRAM write cycle, MRDY# indicates that the data
                 has been latched into the VRAMs.

VBUS[3:0]  I     The VDP COMMUNICATION BUS is used to communicate from the 82750DB to
                 the 82750PB. Codes sent over this bus indicate interrupt requests, transfer
                 requests, and status information. Since the 82750DB and 82750PB run
                 asynchronously, the VBUS signals are sampled on the falling edge of
                 CLKIN and compared with the previous sample. For a VBUS code to be
                 detected by the 82750PB, it must be valid for two successive samples.

HALT#      I     The HALT signal causes the microcode processor on the 82750PB to halt
                 prior to executing the next instruction. This signal does not halt the VRAM
                 interface. The Halt signal will allow the design of a hardware emulator for
                 the 82750PB based on an 82750PB chip.

TEST#      I     The TEST signal is used for test purposes only and must remain high for
                 normal operation.
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Table 1-3. Pin Descriptions (Continued) 


Symbol     Type  Name and Function

PMFRZ#     O     The PERFORMANCE MONITORING AND FREEZE signal is toggled by specific
                 microcode instructions and can be used to determine the time required to
                 execute certain sections of microcode.

Vcc        I     POWER pins provide the +5V D.C. supply input.

Vss        I     GROUND pins provide the 0V connection to which all inputs and outputs
                 are referenced.
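

The host access definition signals and the byte enables described in Table 1-3 can be summarized in a short decoding sketch. It models the pins as plain integers (0 = low, 1 = high) and is only an illustration of the pin descriptions above; the function names and returned strings are hypothetical, and the combination of HREG# and HRAM# both active is treated as an error because the table does not define it.

```python
def host_cycle_type(hreq_n, hreg_n, hram_n, we_n):
    """Classify a host access from the active-low pins HREQ#, HREG#, HRAM#, WE#.

    Returns None when HREQ# is inactive (no host request).
    """
    if hreq_n != 0:
        return None
    if hreg_n == 0 and hram_n == 0:
        raise ValueError("HREG# and HRAM# both active: not defined in Table 1-3")
    if hram_n == 0:
        target = "VRAM"
    elif hreg_n == 0:
        target = "register"
    else:
        target = "external"  # both signals inactive
    direction = "write" if we_n == 0 else "read"
    return f"{target} {direction}"


def enabled_data_lanes(be_n):
    """Return the D[msb:lsb] ranges enabled by a 4-bit BE#[3:0] value (0 = enabled)."""
    return [(8 * i + 7, 8 * i) for i in range(4) if not (be_n >> i) & 1]


def register_word_lanes(be0_n):
    """During host register accesses, BE#[0] selects D[15:0] (0) or D[31:16] (1)."""
    return (15, 0) if be0_n == 0 else (31, 16)
```

For instance, `host_cycle_type(0, 1, 0, 0)` describes a host VRAM write, and `enabled_data_lanes(0b1110)` enables only D[7:0].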


Table 1-4. Output Pins


Name        Active Level   When Floated

CLKOUT      High           Always Driven
A[31:9]     High           Reset*, Host Cycle
HBUSEN#     Low            Reset*
HRDY#       Low            Reset*
HINT#       Low            Reset*
MREQ#       Low            Reset*
TRNFR#,     Low            Reset*
RFSH#
NXTFST#     Low            Reset*
PMFRZ#      Low            Reset*


*The reset state is caused by RESET# being active low.


All output pins are floated when RESET# is active low.


Table 1-5. Input Pins


Name        Active Level   Synchronous/Asynchronous

CLKIN       High           Synchronous
RESET#      Low            Asynchronous
HREQ#       Low            Asynchronous*
HREG#       Low            Synchronous
HRAM#       Low            Synchronous
MRDY#       Low            Synchronous
VBUS[3:0]   High           Asynchronous
HALT#       Low            Synchronous
HALEN#      Low            Asynchronous*


*Can be programmed to accept synchronous inputs.


Table 1-6. Input/Output Pins


Name        Active Level   When Floated          Synch/Async

D[31:0]     High           Reset*, Host Cycle    Synchronous
A[8:2]      High           Reset*, Host Cycle    Synchronous
BE#[3:0]    Low            Reset*, Host Cycle    Synchronous
WE#         Low            Reset*, Host Cycle    Synchronous


*The reset state is caused by RESET# being active low.
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2.0 ARCHITECTURE functionally identical except rO, which also includes 

logic for bit shifting and byte swapping. A register 
can source both the A bus and the B bus in the 
Overview same cycle. A register cannot be the destination of 

both the A bus and the B bus in a single instruction. 
The 82750PB includes a wide instruction word Because the registers are doubly latched, the same 

processor that comprises a number of processing, register may be both a source and destination in the 

storage, and input/output elements. The wide In- same cycle. The result Is that the data in the register 

struction word architecture allows a number of these prior to the current cycle will be driven on the source 

elements to operate in parallel. The 82750PB exe- bus, and the data on the destination bus will be 

cutes one Instruction every internal clock cycle or latched into the register at the end of the cycle. 

T-cycle. The various elements are connected via 

two 16-bit buses, the A bus and B bus, as shown In Register rO has additional logic to allow bit shifting 
Figure 2-1 . During each instruction execution cycle, and byte swapping. The value in rO can be shifted 

data can be transferred from a bus source to a bus left or right one bit position per Instruction cycle. For 

destination element on both buses. a right shift, the new MSB is equal to the old MSB; in 

other words, the value is sign-extended. For left 
shifting, the new LSB is equal to zero. RO can not be 
Registers shifted and loaded in the same instruction. Byte 

swapping, on the other hand, only occurs when rO is 
[rN; N = 0- 15] being loaded with a value from the A bus or B bus. 

Byte swapping causes the most significant byte and 
There are 16 general-purpose data registers, each the least significant byte of the 16-bit value being 

16 bits wide, that are connected to both the A bus loaded into rO to be Interchanged. Refer to Chapter 

and B bus as both sources and destinations. These 4 fQi^ ^ description of the SHFT microcode field that 

registers are designated r0-r15. All the registers are controls the shifting and swapping operations in rO, 
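The r0 shift and byte-swap behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Intel code; the function names are hypothetical, and each function stands for the effect of one instruction cycle on a 16-bit value.

```python
MASK16 = 0xFFFF

def r0_shift_right(value):
    """One-bit right shift; the new MSB equals the old MSB (sign extension)."""
    value &= MASK16
    return (value >> 1) | (value & 0x8000)

def r0_shift_left(value):
    """One-bit left shift; the new LSB is zero."""
    return (value << 1) & MASK16

def r0_byte_swap(value):
    """Interchange the most and least significant bytes of a 16-bit value."""
    value &= MASK16
    return ((value << 8) | (value >> 8)) & MASK16
```

A multi-bit shift of r0 would take one instruction per bit position in this model, which is why the separate barrel shifter exists for single-cycle n-bit shifts.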



Figure 2-1. 82750PB Block Diagram

ALU

[alu, cc]

The ALU performs 16-bit arithmetic and logic operations, and can also
be operated as two independent 8-bit ALUs for the
Dual-Add-with-Saturate operation. There are two fields in the
microcode instruction that affect the operation of the ALU: the ALUOP
field specifies the operation to be performed, and the ALUSS field
specifies the source of the two ALU inputs. Refer to Chapter 4 for
further information on these fields.

The two ALU operands come either from values held in the ALU input
latches or from “eavesdropping” on the A or B buses. The result of any
ALU operation is latched in the ALU output register, alu. In a
subsequent instruction this result can be transferred to any A or B
destination.

The ALU has four condition flag outputs: CarryOut, Sign, Overflow, and
Zero. CarryOut is the carry out of the most significant bit position.
Sign is equal to the value of the most significant bit of the result.
Overflow is the exclusive-OR of CarryOut and the CarryIn to the most
significant bit position of the result. Zero is true (a value of 1) if
all 16 bits of the ALU result are equal to zero. CarryOut and Overflow
are defined as equal to zero for all logical operations. For most ALU
operations, the state of these four condition flags is latched when
the operation is complete. There are eight operations (nop, a*, b*,
+], -], 0*, prof and int) that are exceptions. These operations are
performed without disturbing the condition state of the previous ALU
operation.
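As a worked illustration of the four flag definitions, the following Python sketch models a 16-bit add. It is an assumption-laden model rather than a description of the actual hardware: the function name alu_add and the dictionary keys are hypothetical, and only the addition case is shown.

```python
def alu_add(a, b, carry_in=0):
    """Return (result, flags) for a 16-bit a + b + carry_in."""
    a &= 0xFFFF
    b &= 0xFFFF
    total = a + b + carry_in
    result = total & 0xFFFF
    carry_out = (total >> 16) & 1
    # Carry into the MSB: add the low 15 bits and look at bit 15.
    carry_into_msb = (((a & 0x7FFF) + (b & 0x7FFF) + carry_in) >> 15) & 1
    flags = {
        "carry": carry_out,
        "sign": result >> 15,
        "overflow": carry_out ^ carry_into_msb,
        "zero": int(result == 0),
    }
    return result, flags
```

For example, 0x7FFF + 1 gives 0x8000 with Sign and Overflow set, because a carry enters the MSB but none leaves it.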

Microcode routines can read and write the ALU condition flag register,
cc. This can be used to save and restore the state of these flags. The
bit ordering of the ALU condition flags within cc is given in Table
2-1. A complete list of ALU opcodes is given in Table 2-2.


Table 2-1. Bit Assignments for cc Register 


Bit        Condition
Bit 0      False (This bit of the cc is always read as a zero.)*
Bit 1      ALU Carry Out
Bit 2      ALU Overflow
Bit 3      ALU Sign
Bit 4      ALU Zero
Bit 5      Loop Counter Zero*
Bit 6      R0 LSB*
Bit 7      R0 MSB*
Bit 15:8   RESERVED. The state of these bits is undefined when read;
           write as zeros.

*These are read-only values and are not affected by writes to the cc
register.

Table 2-2. ALU Opcodes

Operation                     Mnemonic
No Operation                  nop
pass a                        a
pass b                        b
1's complement of a           ~a
1's complement of b           ~b
a AND b                       &
(NOT a) AND b
a AND (NOT b)
a OR b                        |
a XOR b                       ^
a + b                         +
a + b + 1                     ++
a - b                         -
-a + b                        -+
2's complement of a           -a
2's complement of b           -b
Increment a                   a++
Increment b                   b++
Decrement a                   a--
Decrement b                   b--
Dual Add with Sat.            +]
a + b + (Prev. Carry)         +<
a - b - (Prev. Borrow)        -<
-a + b - (Prev. Borrow)       -+<
Interrupt Host                int
Zero                          0*
Pass a, Don't Latch Flags     a*
Pass b, Don't Latch Flags     b*
(NOT a) OR b                  ~|
a OR (NOT b)                  |~
Dual Sub. with Sat.           -]
Perform. Monitor/Profile      prof


The Dual-Add-with-Saturate operation performs independent 8-bit ADDs
on the upper and lower bytes of the two ALU operands. The two bytes of
the A operand are treated as unsigned binary numbers (00:FF hex
corresponds to 0:255 decimal). The two bytes of the B operand are
treated as offset binary numbers with an offset of +128 (00:FF hex
corresponds to -128:127 decimal). The upper and lower byte results are
treated as 9-bit offset binary, including the carry output of each
byte, with a +128 offset (000:1FF hex corresponds to -128:383 decimal)
and are saturated to a range of 0-255 decimal. A result that is less
than zero is set equal to zero (00 hex) and a result that is greater
than +255 is set equal to +255 (FF hex).

In fact, this operation is symmetric. Either the A operand or the B
operand can be defined as the unsigned binary value, and the other
operand will be treated as the offset signed binary value.

Dual-subtract-with-saturate is similar to dual-add-with-saturate. It
calculates A - B + 128 on each 8-bit half of the two 16-bit inputs,
and clamps the results to the range 0 to 255. This can be viewed as
subtracting an offset-binary signed byte (-128 to 127) from an
unsigned byte (0 to 255).
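The per-byte arithmetic of the two saturating operations can be sketched as follows. This is a minimal Python model, assuming the description above is complete; the function names dual_add_sat and dual_sub_sat are hypothetical stand-ins for the +] and -] opcodes.

```python
def _clamp(value):
    """Saturate to the unsigned byte range 0..255."""
    return max(0, min(255, value))

def dual_add_sat(a, b):
    """Per byte: unsigned A plus offset-binary B, i.e. A + B - 128, saturated."""
    out = 0
    for shift in (0, 8):
        a_byte = (a >> shift) & 0xFF
        b_byte = (b >> shift) & 0xFF
        out |= _clamp(a_byte + b_byte - 128) << shift
    return out

def dual_sub_sat(a, b):
    """Per byte: A - B + 128, saturated."""
    out = 0
    for shift in (0, 8):
        a_byte = (a >> shift) & 0xFF
        b_byte = (b >> shift) & 0xFF
        out |= _clamp(a_byte - b_byte + 128) << shift
    return out
```

Because the add computes A + B - 128 on each byte, swapping the operands gives the same result, which matches the symmetry noted in the text. A B byte of 80 hex represents zero, so adding 0x8080 leaves the A operand unchanged.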

The ALU opcode ‘int’ generates the MCINT (microcode interrupt)
condition. When this condition is detected by interrupt logic in the
host CPU interface, and if the Enable MCINT bit in the PROCESSOR
CONTROL register is set to a ONE, the host interrupt output, HINT#,
will be asserted. Refer to Chapter 3 for further information on the
host interface.

The ‘prof’ opcode activates the PMFRZ# pin, and is primarily used for
performance monitoring and/or debugging.

Barrel Shifter 

[shift, shift-r, shift-rl, shift-l]

The barrel shifter performs a single-cycle, n-bit left or right shift.
The barrel shifter operates independently of the ALU. The three barrel
shifter operations are: shift-r for a right shift with sign extend;
shift-rl for a right shift with zero fill; and shift-l for a left
shift with zero fill. The shift operation is invoked by writing a
4-bit value (the shift amount) to one of three A bus registers,
depending on which of the three operations is to be performed. The
operand is taken from the B bus, and the result is stored in the
barrel shifter output register, shift. Like the ALU result register,
the value in shift can be read onto the A bus or B bus in the
following instruction cycle.

A barrel shifter operation does not affect any of the 
condition flags. 
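The three barrel shifter operations can be modeled as below. This is a sketch under the stated description only: the Python function names mirror the shift-r, shift-rl, and shift-l registers but are otherwise hypothetical, and the shift amount is simply masked to 4 bits.

```python
def shift_r(operand, amount):
    """Right shift with sign extension (16-bit operand, 4-bit amount)."""
    operand &= 0xFFFF
    amount &= 0xF
    signed = operand - 0x10000 if operand & 0x8000 else operand
    return (signed >> amount) & 0xFFFF

def shift_rl(operand, amount):
    """Right shift with zero fill."""
    return (operand & 0xFFFF) >> (amount & 0xF)

def shift_l(operand, amount):
    """Left shift with zero fill."""
    return ((operand & 0xFFFF) << (amount & 0xF)) & 0xFFFF
```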


Data RAM 

[dramN, *dramN, ++, --; N = 1-4]

The Data RAM holds 512 16-bit words that are accessed using four
pointers. To access a value in a particular location, the microcode
routine must first load a pointer with the address to be accessed, and
then perform a read or write using the same pointer. In parallel with
the Data RAM access, the pointer can optionally be post-incremented or
post-decremented. The four pointers, referred to as dram1-dram4, can
be written and read via the A bus. When a dram pointer, which is only
9 bits wide, is read onto the A bus, its upper seven bits are set to
zeros.

NOTE:

The width of the dram pointers may change in later versions of the
82750PB. Software should not rely on the width of a pointer to, for
example, mask the upper seven bits of a value to zero.


All four pointers can be used to read or write the Data RAM from
either the A or B bus. Only one Data RAM access can be performed in
any cycle. A Data RAM access is referred to, using C language syntax,
as *dram1. The * means “the value pointed to by”. As another example,
*dram3++ means access the Data RAM using the pointer dram3 and
increment dram3. The symbol -- in place of the ++ would indicate
autodecrement.
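The pointer-based access pattern can be illustrated with a small Python model. Everything here is a hypothetical sketch: the class and method names are invented for illustration, and the wrap-around of a pointer at 512 words is an assumption that follows from the 9-bit pointer width rather than a documented behavior.

```python
class DataRam:
    """Toy model of the 512-word Data RAM and its four pointers."""
    SIZE = 512

    def __init__(self):
        self.words = [0] * self.SIZE
        self.pointer = {n: 0 for n in (1, 2, 3, 4)}

    def load_pointer(self, n, address):
        # Assumption: only the low 9 bits of the address are kept.
        self.pointer[n] = address % self.SIZE

    def read(self, n, step=0):
        """Model of *dramN (step=0), *dramN++ (step=+1), *dramN-- (step=-1)."""
        value = self.words[self.pointer[n]]
        self.pointer[n] = (self.pointer[n] + step) % self.SIZE
        return value

    def write(self, n, value, step=0):
        self.words[self.pointer[n]] = value & 0xFFFF
        self.pointer[n] = (self.pointer[n] + step) % self.SIZE
```

In this model, ram.write(3, v, +1) corresponds to the microcode form *dram3++ = v: the access uses the pointer value from before the update.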


Loop Counters 

[cnt, cnt2]

Two 16-bit loop counters are available to microcode programs for
automatically counting iterations of a microcode loop. In parallel
with other operations performed in an instruction, either loop counter
can be decremented, and a conditional branch can be made based on the
loop counter value being equal or not equal to zero. Since the two
loop counters can be written and read on the A bus, as cnt and cnt2
respectively, they can also be used for variable storage when not
being used as loop counters. The loop counters can be written to and
decremented during the same instruction cycle. The value in the
counter at the start of the next cycle will be the value written to
the counter minus one.

The LC microcode bit determines the loop counter that is selected for
decrementing and/or branching in an instruction. The LC microcode bit
does not affect the loop counter that is written or read over the A
bus, since each loop counter is separately addressable as an A bus
source or destination. Refer to Chapter 4 for a description of the
CNT-- microcode bit that causes the selected loop counter to be
decremented, and for a description of the CFSEL microcode field that
is used to perform a conditional branch based on the selected loop
counter's value.


Microcode RAM 

[mcode1-3, maddr, pc]

The 82750PB executes instructions stored in an on-chip microcode RAM.
This RAM holds 512 instructions and each instruction is 48 bits wide.
Normally, to start the microcode processor, the host CPU will load a
microcode program into the microcode RAM, point the program counter,
pc, to the start of the program, and then release the HALT bit to
start executing the microcode program. The microcode processor can
also load its own microcode RAM to overlay new routines and therefore
does not require constant intervention by the host to perform multiple
operations.

Writing an instruction into microcode RAM is done by first loading the
three registers mcode3, mcode2, and mcode1 with the three 16-bit words
of the instruction (the most significant word goes into mcode1), and
then loading the address where the instruction should be written into
maddr.

The host CPU can also read the microcode RAM by first loading the pc
with the address of the instruction to be read and then reading the
three 16-bit words of the instruction from the mcode1-mcode3
registers. Normally, this would be done by the host CPU while the
82750PB is halted. Since mcode1-mcode3 hold the instruction pointed to
by the pc (i.e. the instruction that is about to be executed),
reading these three registers from a microcode routine is normally not
useful.

The read registers named mcode1-mcode3 and the write registers also
named mcode1-mcode3 are in fact different registers. Writing values
into mcode1-mcode3 and then reading the values of mcode1-mcode3 will
not read back the same values just written. The read registers hold
the instruction stored in the instruction latch (the instruction to be
executed). The write registers hold an instruction that is about to be
written into microcode RAM.

After writing to maddr to load an instruction into microcode RAM, a
one-cycle freeze occurs, and during the freeze a write to the
microcode RAM takes place. The instruction following the write to
maddr can either jump to the address just loaded or start loading the
mcode1-mcode3 registers with the next instruction to be written.

Here are two examples that illustrate the fact that the 82750PB
requires at least one instruction between the write to maddr and the
execution of the instruction that is loaded by the write to maddr.

Example 1:

        maddr = ADDR1    /* load instruction */
        jmp ADDR1        /* jump to it, this is the extra inst. required between */
                         /* writing to maddr and executing the loaded inst. */
ADDR1:
        ???????????      /* here's where new instruction gets loaded */

Example 2:

        maddr = INST
        nop              /* extra instruction */
INST:
        ???????????      /* instruction gets loaded here */

When a microcode routine writes to pc, one more instruction is executed before the jump to the new address
takes effect. For example:

        pc = ADDR1
        r0 = r1  jmp ADDR2   /* this instruction gets executed but */
                             /* its jump to ADDR2 is ignored. */
ADDR1:
        r3 = r0              /* after this instruction executes r3 = r0 = r1 */

When the host CPU writes to the pc, the instruction at the address
that was written is loaded into the mcode1-mcode3 registers and, when
the microcode processor is released from its Halt condition, this is
the first instruction that will be executed.

When the host CPU reads the pc, the result returned is the address of
the instruction that will be executed when Halt is released, that is,
the address of the instruction held in the mcode1-mcode3 registers.


Horizontal Line Counter 

[lcnt]

The 12-bit Horizontal Line Counter is updated by VBUS codes from the
82750DB to track the horizontal display line that is currently being
scanned by the 82750DB. The counter is reset by a VODD code and
incremented each time an HLINE code is received. A value can also be
written into the Horizontal Line Counter, but this is used primarily
for testing the 82750PB. The upper four bits will always read zeros.


Field Counter 

[fcnt]

The 4-bit field counter is updated by VBUS codes from the display
processor to keep track of the field count being displayed by the
82750DB. The counter is incremented each time a VODD code or VEVEN
code is received. When reading the field counter, the upper 12 bits
will read zeros. This counter will not be initialized upon reset.


Input FIFOs 

[inN-lo, inN-hi, inN-c, *inN; N = 1,2] 

There are two input channels, referred to as input FIFOs, through
which the processor can read pixels or data from VRAM. Each channel
automatically fetches 64-bit quad words from VRAM and breaks them into
8-bit bytes or 16-bit words that are read by microcode. Each input
FIFO operates independently and can be programmed to automatically
increment or decrement through bytes or words in VRAM. The FIFOs are
double buffered so that while values are being extracted from one quad
word (64 bits), the next quad word is being prefetched from VRAM.


The mode control register for each input FIFO, designated in1-c or
in2-c, contains four mode bits as seen in Figure 2-2. The WORD/BYTE
bit (bit 0) determines whether the input FIFO is in word mode
(WORD/BYTE = 0) or byte mode (WORD/BYTE = 1). In byte mode, the FIFO
can start reading on any byte boundary and in word mode on any word
boundary.

The INC/DEC bit (bit 1) determines the order in which bytes or words
are read from VRAM. In INCREMENT mode, with INC/DEC = 0, the FIFO
reads from the least significant byte or word to the most significant
byte or word of each double word and increments through double words
in VRAM. In DECREMENT mode, with INC/DEC = 1, the FIFO reads from the
most significant byte or word to the least significant byte or word
within a double word and decrements through double words in VRAM.

The AHOLD bit (bit 2) is used for the address hold mode. When selected
(bit 2 = 1), the automatic address increment/decrement function will
be disabled and the input FIFO will not double buffer VRAM data. In
other words, at the end of a VRAM cycle, when the FIFO has been
updated with 64 bits of VRAM data, the input FIFO will not issue
another MREQ# until there is a write to the address-lo register or a
roll-over/roll-under read access of the input FIFO. If a
roll-over/roll-under occurs, then a memory request will be issued to
fetch data from the same VRAM location. If there is a write to the
address-lo register, the FIFO will then fetch data from the new
location.

The PREFETCH OFF bit (bit 3) specifies whether the FIFO will
automatically prefetch successive quad words from VRAM or will only
fetch a new quad word when a value from that quad word is requested.
In PREFETCH-ON mode, bit 3 = 0, the input FIFO prefetches successive
quad words from VRAM as necessary to keep its buffer full (either from
ascending or descending addresses, depending on the state of the
INC/DEC bit). In PREFETCH-OFF mode, the FIFO will still prefetch the
first two quad words to fill its buffer (when started at a new address
location), but will only fetch a new quad word when a read request is
made to the FIFO for a value in the next unfetched quad word.

The CB bit (bit 4) allows circular buffers of sizes 64 Kbytes,
128 Kbytes, or 256 Kbytes to be created in VRAM. The buffer size is
selected by programming the least significant 3 bits of the circular
buffer register (circbuf). To enable this feature, the CB bit must be
set to 1; then, depending on the buffer size selected, the appropriate
address pin that goes off chip will be forced to 0 (register pointers
remain unchanged). Table 2-3 shows the programming combinations of the
circular buffer register.

Bits:   15..6          5            4    3        2       1         0
        Set to Zeros   BY-32 MODE   CB   PF OFF   AHOLD   INC/DEC   WORD/BYTE

Figure 2-2. Input FIFO Control Register

It is important to note that the internal address counters themselves
are not affected by the circbuf function. Only the selected external
address pin is forced to 0.


Table 2-3. Circular Buffer Register (circbuf)

Bits [2:0]   Buffer Size   Effect on PB Address Bus (If Function Enabled)
000          Disabled      None
100          256 Kbytes    Address Pin 18 Forced to 0
010          128 Kbytes    Address Pin 17 Forced to 0
001          64 Kbytes     Address Pin 16 Forced to 0
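The effect of Table 2-3 on an outgoing address can be sketched as a simple bit mask. This Python model is illustrative only: the names CIRCBUF_PIN and external_address are hypothetical, and it assumes that "Address Pin N" corresponds to bit N of the address value, which the source does not state explicitly.

```python
# circbuf bits [2:0] -> address pin forced to 0 (None means disabled)
CIRCBUF_PIN = {0b000: None, 0b100: 18, 0b010: 17, 0b001: 16}

def external_address(internal_address, circbuf_bits, cb_enabled):
    """Return the address driven off chip; the internal counter is untouched."""
    pin = CIRCBUF_PIN[circbuf_bits & 0b111]
    if not cb_enabled or pin is None:
        return internal_address
    return internal_address & ~(1 << pin)
```

The internal counter keeps counting past the buffer boundary; only the value presented on the pins is altered, which is the point made in the text about register pointers remaining unchanged.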


In “BY-32” MODE (bit 5), the pointer increments or decrements by 32
bits, independent of whether the FIFO is in 8-bit pixel mode or 16-bit
pixel mode. This mode was added to facilitate microcode that operates
on one component of a 32-bit per pixel image.
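As a compact summary of the input FIFO mode bits, the following Python sketch assembles a control word from individual options. The constant and function names are hypothetical; the bit positions follow Figure 2-2 (WORD/BYTE = bit 0, INC/DEC = bit 1, AHOLD = bit 2, PF OFF = bit 3, CB = bit 4, BY-32 MODE = bit 5), with bits 15..6 left at zero.

```python
WORD_BYTE = 1 << 0   # 1 = byte mode, 0 = word mode
INC_DEC = 1 << 1     # 1 = decrement, 0 = increment
AHOLD = 1 << 2       # address hold
PF_OFF = 1 << 3      # prefetch off
CB = 1 << 4          # circular buffer enable
BY_32 = 1 << 5       # step the pointer by 32 bits

def input_fifo_control(byte_mode=False, decrement=False, address_hold=False,
                       prefetch_off=False, circular=False, by_32=False):
    """Build the 16-bit value for in1-c or in2-c from the selected modes."""
    value = 0
    if byte_mode:
        value |= WORD_BYTE
    if decrement:
        value |= INC_DEC
    if address_hold:
        value |= AHOLD
    if prefetch_off:
        value |= PF_OFF
    if circular:
        value |= CB
    if by_32:
        value |= BY_32
    return value
```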

The standard sequence for initializing an input FIFO is to write to
the control register (in-c), the high address (in-hi), and then the
low address (in-lo) of the appropriate FIFO. Refer to the access state
diagram in Chapter 3. The write to in-lo causes the FIFO to start
reading from VRAM. A byte or word is then read from *in. Successive
reads from *in will read sequential bytes or words from VRAM. Writing
to the control register each time the FIFO is started at a new address
is not necessary, except to change the FIFO's mode. Also, if the new
address is within the same 64 Kbyte page of VRAM, only the lo-address
needs to be written in order to start the FIFO reading from the new
address.

If microcode attempts to read a value from an empty input FIFO, the
processor is frozen prior to the execution of the instruction, until
the FIFO's control logic has fetched another double word from VRAM and
extracted the next value. At this point, the processor is released
from the frozen state, and the instruction that reads the value is
executed. When the processor is frozen waiting for a particular FIFO
that isn't yet ready, that FIFO's VRAM access priority is raised above
all other FIFOs.


Output FIFOs 

[outN-lo, outN-hi, outN-c, *outN, outN++; N = 1,2]

There are two output channels, referred to as output 
FIFOs, through which the graphics processor writes 
pixels or data to VRAM. Each channel automatically 
collects bytes or words into 64-bit quad words and 
writes the quad words to VRAM. Each output FIFO 
operates independently and can be programmed to 
write bytes or words into sequential addresses in 
VRAM (either incrementing or decrementing). The 
FIFOs are double buffered so that while one quad 
word is waiting to be written to VRAM, the next quad 
word can be assembled from individual bytes or 
words. 

The mode control register for each output FIFO, designated out1-c or
out2-c, contains six mode bits as shown in Figure 2-3. The WORD/BYTE
bit (bit 0) determines whether the output FIFO is in word mode
(WORD/BYTE = 0) or byte mode (WORD/BYTE = 1). In byte mode the FIFO
can start writing on any byte boundary in VRAM and in word mode on any
word boundary.

The INC/DEC bit (bit 1) determines the order that 
bytes or words are written to VRAM. In INCREMENT 
mode, with INC/DEC = 0, the FIFO writes from the 
least significant byte or word to the most significant 
byte or word in a double word and increments 
through double words in VRAM. In DECREMENT 
mode, with INC/DEC = 1, the FIFO writes from 
most significant byte or word to least significant byte 
or word within a double word and decrements 
through double words in VRAM. 

When the AHOLD bit (bit 2) is set, the output FIFO quad word address
is not incremented or decremented. In this mode, the FIFO continues to
output to a single quad word in VRAM.

Figure 2-3. Output FIFO Control Register

The FORCE-LSB bits (bits 3 and 4) are used to force the least
significant bit of each byte written to VRAM to either a zero or a
one. This can be used, for example, to force the LSB to the correct
polarity when writing to the U bitmap during motion video
decompression. In certain display modes for the 82750DB, the LSB of
the 8-bit samples in the U or Y bitmap is used to select VIDEO or
GRAPHICS display mode for the n x n group of display pixels
corresponding to the particular U or Y sample. A one in the FORCE-LSB
ENABLE bit (bit 4) enables the forcing; a zero results in normal
operation. The FORCE-LSB VALUE bit (bit 3) is used as the value to
which the LSB is forced. Whether in byte mode or word mode, the LSB of
each byte is forced to the FORCE-LSB value.
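The FORCE-LSB behavior can be sketched on a single 16-bit word as follows. This is a hypothetical Python model, not a description of the hardware datapath; the function name apply_force_lsb is invented, and the control argument stands for the output FIFO control register value with ENABLE in bit 4 and VALUE in bit 3.

```python
FORCE_LSB_VALUE = 1 << 3
FORCE_LSB_ENABLE = 1 << 4

def apply_force_lsb(word, control):
    """Force the LSB of both bytes of a 16-bit word when forcing is enabled."""
    word &= 0xFFFF
    if not control & FORCE_LSB_ENABLE:
        return word               # normal operation
    lsb_mask = 0x0101             # bit 0 of each byte
    cleared = word & ~lsb_mask & 0xFFFF
    return cleared | (lsb_mask if control & FORCE_LSB_VALUE else 0)
```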

In “BY-32” MODE (bit 5), the pointer increments or decrements by 32
bits, independent of whether the FIFO is in 8-bit pixel mode or 16-bit
pixel mode. This mode is used to facilitate microcode that operates on
one component of a 32-bit per pixel image. The bytes or words that are
skipped over will be unchanged in VRAM.

The standard sequence for initializing an output FIFO is to write to
the control register (out-c), the low address (out-lo), and then the
high address (out-hi) of the appropriate FIFO. A series of bytes or
words is then written to *out. Refer to the access state diagram in
Chapter 3 (Figure 3-1).

In order to flush any remaining data in an output FIFO before changing
its VRAM pointer, it is necessary to write to the control register.
When pointing to a new location in VRAM, if the new address is within
the same 64 Kbyte page of VRAM, only the lo-address needs to be
written.

There must be one instruction between the write to the output FIFO's
low address and the first write to *outN. Therefore, it is recommended
that outN-lo be written before outN-hi. The write to outN-hi ensures
that this requirement is met. If only the outN-lo value is being
changed, it is still necessary to have one additional instruction
before the first write to *outN.

When writing bytes or words to VRAM through an output FIFO, a byte or
word can be skipped over by writing to outN++ instead of *outN. When
the values are written to VRAM, any byte or word that was skipped will
retain its original value in VRAM, and its value is not altered by the
VRAM write. This can be used when writing a series of pixels, some of
which are “transparent”, allowing whatever was behind them to show
through.

If the microcode routine attempts to write a value to a full output
FIFO, the processor is frozen prior to the execution of the
instruction. The processor remains frozen until the FIFO has a chance
to write one of the buffered quad words to VRAM. At that point, the
processor is released from the frozen state, and the instruction that
writes the value is executed. When the processor is frozen, waiting
for a particular FIFO that isn't yet ready, that FIFO's VRAM access
priority is raised above all other FIFOs.

Statistical Decoder

[stat-lo, stat-hi, stat-c, stat-ram, *stat, *stat#]

The Statistical Decoder (also referred to as the Huffman Decoder) is a
specialized input channel that can read a variable-length bit sequence
from VRAM and convert it into a fixed-length bit sequence that is read
by the microcode processor. In image compression, as well as in other
applications such as text compression, certain values occur more
frequently than others. A means of compressing this data is to use
fewer bits to encode more frequently occurring values and more bits to
encode less frequently occurring values. This type of encoding results
in a variable-length sequence in which the length of a symbol (the
group of bits used to encode a single value) can range, for example,
from one bit to sixteen bits.


The statistical code that the statistical decoder can decode is of
either of the two forms:

        0x                         1x
        10x                        01x
        110xxx                     001xxx
        1110xxxxx                  0001xxxxx
        ...                 or     ...
        11111110xxxxxx             00000001xxxxxx
        111111110xxxxxx            000000001xxxxxx

Each symbol of a given length (one per line as shown here) consists of
a run-in sequence followed by some number of x-bits. The run-in
sequence is defined as a series of zero or more ONEs followed by a
ZERO or, as in the code on the right above, zero or more ZEROs
followed by a ONE. The remainder of this description will use examples
of the code on the left. A bit in the decoder's control register
determines the polarity of the run-in sequence bits.

In the example on the left, there would be two symbols of length two:
00 and 01. Each x-bit can take on a ZERO or ONE value. The number of
x-bits following a run-in sequence can range from zero to six. Since
the goal, in general, is to have a few short codes and a larger number
of long codes, codes with fewer run-in bits will typically have fewer
x's following. However, this is not a hardware constraint. A code of
this form is completely described by a code description table that
indicates, for each run-in length R (the number of ONEs in the
run-in), how many x-bits follow the ZERO. The value of R is used as an
index into the code description table. Due to the hardware
implementation, the number actually stored in the table is 2^x, where
x is the number of x-bits.

For the example above, the corresponding code description values are
given in Table 2-4.

Table 2-4. Sample Code Description Table

R    X    2^x (dec.)   2^x (bin.)
0    1    2            000 0010
1    1    2            000 0010
2    3    8            000 1000
3    5    32           010 0000
7    6    64           100 0000


Note that the table only goes up to symbols with seven ONEs in the
run-in. The values of X and 2^x for seven ONEs are used for all
symbols having seven or more ONEs in the run-in sequence. For example,
in the code above a symbol with eight or more ONEs in the run-in
sequence has six x-bits following the ZERO, which is the same as
symbols having seven ONEs.

For each different symbol, including all symbols of the same run-in
length with different x-bit values, the decoder generates a unique
fixed-length, 16-bit value. Some of the decoded values for the sample
code given above are provided in Table 2-5.


Table 2-5. Decoded Values 


Symbol*      Decoded Value
00           0
01           1
100          2
101          3
110000       4
110001       5
110010       6
...
110111       11
111000000    12
...
111011111    43




*The x-bits of the symbol are in boldface for clarity. 


The algorithm for generating a decoded value from a symbol is as
follows: all symbols of a given run-in length are assigned a base
value, B; the value corresponding to a particular symbol is equal to B
plus the binary value of the x-bits in the symbol. The base value B
for a symbol with a run-in length of R is calculated by:

        B(R) = SUM[2^X(r)] with r = 0 to R - 1,

where X(r) corresponds to the X value in the table entry corresponding
to R = r.

For example, in the above code:

        B(0) = 0 (B(0) is always zero)
        B(1) = 0 + 2 = 2
        B(2) = 0 + 2 + 2 = 4
        B(3) = 0 + 2 + 2 + 8 = 12
        B(4) = 0 + 2 + 2 + 8 + 32 = 44

This is one of the reasons that the table holds 2^x instead of X: the
calculation of B(R) is easier to implement in logic.

There are two enhancements that are made to this coding scheme in the
implementation on the 82750PB. These two modes are referred to as END
mode and SHORT mode. If neither END nor SHORT mode is enabled, the
decoding is performed as described above. SHORT mode allows the
decoder to be switched easily to a simpler code format without having
to reload the code description table. In the SHORT form, all symbols
have the same number of x-bits, as though all entries in the table had
been filled with the same value of 2^x. When SHORT mode is invoked,
this value of 2^x is obtained from a field in the statistical
decoder's CONTROL word, instead of from the individual table entries.

END mode is added in recognition of the fact that, for codes with few
symbols, some increase in efficiency is possible by not having to
place a zero at the end of the longest run-in sequence. For example,
consider the code:

        0
        10x
        110x

The END mode allows us to shorten the last symbol to 11x instead of
110x. The trailing ZERO is not required because the decoder has been
told that the maximum length of a run-in is two ONEs. The resulting
symbol set and corresponding decoded values are given in Table 2-6.

Table 2-6. END Mode Decoded Values

Symbol   Decoded Value
0        0
100      1
101      2
110      3
111      4

The number of x-bits must be constant for all symbols of the same
run-in length. Therefore, a code such as:

        0
        10xx
        11xxx     NOT CORRECT! . . . Must be 11xx.

is not allowed. The last symbol (11xxx, in this case) uses the same
table entry for 2^x as the next to last symbol (10xx) and, therefore,
the last symbol will be 11xx.

The maximum length of the run-in sequence in END mode is specified by
placing an END flag in the code description table. For example, a code
and the corresponding table are shown in Table 2-7.


Table 2-7. END Flag Decoded Values

            Table Entries
Code        Index   END Bit   2^x
0           0       0         0
10xx        1       0         4
110xxx      2       1         8
111xxx      3       -         -
            4       -         -
            5       -         -
            6       -         -
            7       -         -


The hyphens indicate that those table entries aren't used to decode
this code. Note that the symbol 111xxx has three x-bits because of the
value of 2^x in Index 2; it is not based on the 2^x value in Index 3.
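The decoding rule, including the END flag, can be checked with a short software model. The Python sketch below is not the hardware algorithm; it is a reconstruction under several assumptions that are labeled in the comments: bits are consumed in the order the symbols are written, x-bits are taken most significant bit first (consistent with Table 2-5), a run-in with zero x-bits is stored as 2^0 = 1 (consistent with the decoded values in Table 2-6), and the table index saturates at seven. The names decode_symbol, bits_of, SAMPLE_TABLE, and END_TABLE are hypothetical.

```python
def decode_symbol(bits, table):
    """Decode one symbol from an iterable of bits (0 or 1).

    table[i] is (end_flag, two_to_the_x) for run-in index i.
    Run-in polarity is fixed here: ONEs terminated by a ZERO.
    """
    stream = iter(bits)
    ones = 0
    base = 0
    while True:
        # Assumption: entries beyond index 7 reuse entry 7.
        end_flag, count = table[min(ones, 7)]
        if next(stream) == 0:
            break              # ZERO terminates the run-in
        base += count          # skip all values of the shorter run-in
        ones += 1
        if end_flag:
            break              # END flag: longest run-in has no trailing ZERO
    x_bits = count.bit_length() - 1
    value = 0
    for _ in range(x_bits):    # assumption: x-bits arrive MSB first
        value = (value << 1) | next(stream)
    return base + value

def bits_of(text):
    """Convert a string such as '110001' to a list of bit values."""
    return [int(ch) for ch in text]

# First four entries of Table 2-4 (no END flags).
SAMPLE_TABLE = [(False, 2), (False, 2), (False, 8), (False, 32)]

# The code 0, 10x, 11x from Table 2-6: END flag set in index 1.
END_TABLE = [(False, 1), (True, 2)]
```

Running the symbols of Table 2-5 and Table 2-6 through decode_symbol reproduces the listed decoded values, which is a useful sanity check on the base-value formula B(R).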

The SHORT and END modes can be invoked simultaneously, resulting in a
code such as:

        0x
        10x
        110x
        111x

with a SHORT 2^x value = 2 (for 1 x-bit in each symbol) and the END
bit set in Index 2.

Packed binary fields with one to seven bits per field can be read
using the statistical decoder by setting the END bit in Index 0 and by
programming the X value to be N - 1, where N is the number of bits per
field. For example, packed three-bit fields could be decoded as shown
in Table 2-8.


Table 2-8. Packed 3-Bit Field Decoded Values

         Table Entries
Code     Index   END Bit   2^x
0xx      0       1         4 (N = 3, so X = 2)
1xx      1       -         -
         2       -         -
         3       -         -
         4       -         -
         5       -         -
         6       -         -
         7       -         -


The unpacked bits are in reverse order relative to how they are stored
in VRAM. For example, if three-bit values are packed in VRAM, the
pattern 110 in VRAM is read from right to left and gives an unpacked
or decoded value of 3.
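
The bit reversal described above can be illustrated with a short software model. This is only a sketch in Python, not 82750PB microcode; the function name and the `width` parameter are hypothetical. It assumes fields are read starting at the least significant bit and that the first bit read becomes the most significant bit of the unpacked value.

```python
def unpack_fields(word, n_bits, width=32):
    """Unpack n_bits-wide fields from `word`, reading from the LSB upward.

    The first bit read becomes the MSB of each unpacked value, so every
    field comes out bit-reversed relative to its stored pattern.
    """
    values = []
    for start in range(0, width - n_bits + 1, n_bits):
        value = 0
        for i in range(n_bits):
            bit = (word >> (start + i)) & 1
            value = (value << 1) | bit  # earlier bits end up more significant
        values.append(value)
    return values


# The stored pattern 110 is read right to left as 0, 1, 1, giving 011 = 3.
print(unpack_fields(0b110, 3, width=3))
```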

The CONTROL register for the statistical decoder (stat-c) is used to
specify the mode to use for decoding, as well as to invoke certain
modes for writing and reading the code description table. Refer to the
bit assignments for this register below. To write to the code
description table, the WRITE bit (bit 4) is set to a ONE; the starting
table index is reset to zero. Each write to the table causes the index
to increment by one. This index will wrap around from seven back to
zero. For example, to write all eight table entries the user would
write a value of 0x10 to the stat-c register and then write eight
8-bit values to the register stat-ram. The most significant bit of
each 8-bit value is the END bit, and the lower seven bits are the
value of 2^X. To read the code description table, the TEST bit (bit 5)
of the CONTROL register is set to a ONE. The table entries are then
read from the decoder's data register (*stat). Reads and writes always
start at table entry zero.
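
As an illustration of the entry format just described, the following Python sketch builds the sequence of register writes that would load a code description table. It is a host-side model only; the helper names are hypothetical, and filling unused entries with zero is an assumption, not something the text specifies.

```python
WRITE_MODE = 0x10  # WRITE bit (bit 4) of the CONTROL register


def table_entry(two_pow_x, end=False):
    """Pack one 8-bit table entry: END flag in bit 7, 2^X in bits 6:0."""
    if not 0 <= two_pow_x <= 0x7F:
        raise ValueError("2^X must fit in seven bits")
    return (0x80 if end else 0x00) | two_pow_x


def table_write_sequence(entries):
    """Return (register, value) pairs: one stat-c write, then eight stat-ram writes."""
    padded = list(entries) + [(0, False)] * (8 - len(entries))  # assumed filler
    writes = [("stat-c", WRITE_MODE)]
    writes += [("stat-ram", table_entry(x, end)) for x, end in padded]
    return writes


# Entries taken from Table 2-7: indexes 0 to 2 are used, END is set in index 2.
for register, value in table_write_sequence([(0, False), (4, False), (8, True)]):
    print(f"{register} = 0x{value:02X}")
```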

NOTE:

When reading the code description table, it is necessary to wait one
instruction time between the write to stat-c and the first read from
*stat. An access diagram showing all legal sequences for read and
write FIFO registers is shown in Chapter 3 (Figure 3-1).
The code for reading the eight table entries into the first eight locations of data RAM would be:

    dram3 = 0        stat-c = 0x20   /test mode to read the stat-ram (the table)
    cnt = 8                          /wait one inst. before first read
LOOP:
    *dram3++ = *stat     cnt
    jcp loop                         /two inst. loop necessary to wait one inst.
                                     /between each read from *stat.


Bits    15    14     13   12:8   7      6    5     4      3      2:0
        POL   RSVD*  CB   SVAL   SHORT  END  TEST  WRITE  RSVD*  Starting Stat-ram ADDRESS

* Reserved: write zeros to these bits.


Figure 2-4. Statistical Decode CONTROL Register


END mode is enabled by setting the END bit (bit 6) in the CONTROL
register to a ONE. The SHORT mode is enabled by setting the SHORT bit
(bit 7) in the CONTROL register to a ONE. When in SHORT mode, the five
SVAL bits (bits 12:8) in the CONTROL register are used as the
SHORT-2^X value.

The POL bit (bit 15) determines the polarity of the run-in sequence
bits. If bit 15 = 0, the ONEs-ending-in-ZERO sequence (e.g., 1110xxx)
is selected. If bit 15 = 1, the ZEROs-ending-in-ONE sequence (e.g.,
0001xxx) is selected.
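
The CONTROL register fields described in this section can be summarized as a small Python sketch. The function and parameter names are hypothetical; only the bit positions come from the text (POL = 15, CB = 13, SVAL = 12:8, SHORT = 7, END = 6, TEST = 5, WRITE = 4). Reserved bits and the low three bits are left at zero here, which is an assumption for illustration.

```python
def stat_control(pol=0, cb=0, sval=0, short=0, end=0, test=0, write=0):
    """Assemble a 16-bit value for the statistical decoder CONTROL register."""
    if not 0 <= sval <= 0x1F:
        raise ValueError("SVAL is a five-bit field")
    for flag in (pol, cb, short, end, test, write):
        if flag not in (0, 1):
            raise ValueError("single-bit fields must be 0 or 1")
    return (
        (pol << 15) | (cb << 13) | (sval << 8)
        | (short << 7) | (end << 6) | (test << 5) | (write << 4)
    )


print(hex(stat_control(write=1)))                  # table write mode
print(hex(stat_control(test=1)))                   # table read (TEST) mode
print(hex(stat_control(short=1, end=1, sval=2)))   # SHORT and END together
```

The first two values agree with the 0x10 and 0x20 constants used in the text for writing and reading the code description table.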

The CB bit (bit 13) allows circular buffers of 64 Kbytes, 128 Kbytes,
or 256 Kbytes to be created in memory, as in the case of the input
FIFO. The buffer size is selected by programming the least significant
3 bits of the circular buffer register (circbuf). To enable this
feature, the CB bit must be set to 1; then, depending on the buffer
size selected, the appropriate address pin that goes off chip is
forced to 0 (register pointers remain unchanged). Table 2-3 shows the
programming combinations of the circular buffer register.

The decoding parameters may be changed between symbols by writing to
the CONTROL register and, if necessary, writing new values into the
code description table. The correct procedure for changing the code
type or decode mode is to read the last value from the decoder prior
to the change, using *stat# instead of *stat. This keeps the decoder
from automatically starting to decode the next symbol. At this point,
the code description table and the SHORT and END mode bits can be
changed as desired. The next time the CONTROL register is written with
both TEST = 0 and WRITE = 0, the decoder will begin to decode the next
symbol using the new parameters.


The statistical decoder buffers one quad word read from VRAM so that
the decoding of bits in one 32-bit word and the fetch of the next
32-bit word may overlap. As with the input and output FIFOs, the
decoder has a VRAM pointer associated with it that points to the
location in VRAM from which it is reading data. This pointer
increments twice each time a new quad word is read; there is no
decrement mode. When the least significant word of the decoder's
pointer (stat-lo) is written, any data that had previously been
prefetched from VRAM is ignored, and the decoder fetches one quad word
starting from this new location.

The 82750PB assumes that the statistically encoded bitstream in VRAM
starts with the least significant bit of a double word. That is, the
two LSBs of the address written to stat-lo are ignored.

The statistical decoder decodes data at a rate of one bit per T-cycle.
To a first approximation, the decode time for an N-bit symbol is:

    decode time (in T-cycles) = N + 1

Since it takes at least 64 T-cycles to decode data from one quad word,
which is the time required for eight quad word reads from VRAM, the
decoder should rarely run out of data. Therefore, the above estimate
should very accurately model the actual decoding rate of the
statistical decoder.

The statistical decoder always begins to read the bitstream from the
least significant bit of the double word found at the starting
location in VRAM. That is, the decoder does not start on a byte or
word boundary as an input FIFO or output FIFO does, but only on double
word boundaries. The bitstream moves from the least significant bit to
the most significant bit of a double word and then to the least
significant bit of the next double word (at the next higher address
location). For the x-bits, the first x-bit read from the bitstream
becomes the most significant bit of the x-bit field when it is
interpreted as a binary number. The example below shows a code
definition, a bitstream stored in VRAM, and the resulting decoded
values.

The code definition and range of values for each symbol length are
indicated in Table 2-9.


Table 2-9. VRAM Bitstream Decode Values

Symbol      Values    Comments
0           0
10x         1, 2      100 = 1, 101 = 2
110xx       3-6       11000 = 3, ..., 11011 = 6
1110xxx     7-14      1110000 = 7, ..., 1110111 = 14


Decoding starts at address 0 in this example. The two double words at
addresses 0 and 1 are:

    0: 0xAC98E14D
    1: 0x372E74CB

The bitstream in VRAM, with colons dividing the symbols (read from
right to left starting at LSB of address 0), is shown in Figure 2-5.

Table 2-10 lists the symbols, in the order they are 
encountered in the bitstream, and the corresponding 
decoded values. 


Table 2-10. Decoding Symbols

Symbol      Value   Comments
101         2       Starts at LSB, Address 0, Scanning Left
100         1
101         2
0           0
0           0
0           0
0           0
1110001     8
100         1
100         1
11010       5
1110100     11      Spans First and Second Double Word
11001       4
0           0
1110011     10
101         2
0           0
0           0
1110110     13






Address   MSB        Read bitstream from LSB to MSB  <----         LSB

0         1:01011:001:001:1000111:0:0:0:0:101:001:101   <- Start
          |
          First bit of a symbol, continued at LSB of next double word

1         0:0110111:0:0:101:1100111:0:10011:001011

Figure 2-5. VRAM Bitstream Decoding Addresses
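
The worked example can be checked with a small software model of the decoder. The Python sketch below is not the hardware algorithm; it assumes ONEs-ending-in-ZERO run-in polarity (POL = 0), no SHORT or END mode, and a code table given as the number of x-bits per run-in length. The function names are hypothetical.

```python
def bit_reader(words):
    """Yield bits from each 32-bit double word, LSB first, in address order."""
    for word in words:
        for i in range(32):
            yield (word >> i) & 1


def decode(words, x_bits, count):
    """Decode `count` symbols; x_bits[r] is the x-bit count for run-in length r."""
    bits = bit_reader(words)
    # base[r] = B(r) = sum of 2^X(i) for i < r, the first value of each length.
    base = [0]
    for x in x_bits:
        base.append(base[-1] + (1 << x))
    values = []
    for _ in range(count):
        run = 0
        while next(bits) == 1:      # count ONEs until the terminating ZERO
            run += 1
        field = 0
        for _ in range(x_bits[run]):
            field = (field << 1) | next(bits)   # first x-bit read is the MSB
        values.append(base[run] + field)
    return values


# Code of Table 2-9: 0, 10x, 110xx, 1110xxx
print(decode([0xAC98E14D, 0x372E74CB], [0, 1, 2, 3], 19))
```

Run on the two double words of the example, this produces the same 19 values listed in Table 2-10, including the symbol that spans the two double words.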
Figure 2-6. Pixel Interpolation


Pixel Interpolator 

[Pixint-c, Pixint] 

The pixel interpolator performs bilinear interpolation on four 8-bit
pixels to generate, in effect, a pixel shifted by a fraction of a
pixel position. See Figure 2-6. If the four pixels have values of A,
B, C, and D, and the horizontal weight and vertical weight are h and
v, respectively, the interpolated value W, ignoring any quantization
effects, is given by:

    W = A*(1-h)(1-v) + B*h(1-v) + C*(1-h)v + D*hv

The values of h and v are even multiples of 1/16. Figure 2-6
illustrates pixel interpolation with an h weight of 6/16 or 3/8 and a
v weight of 10/16 or 5/8.
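
The formula for W can be evaluated directly, as in the Python sketch below. It uses exact fractions and so ignores quantization, exactly as the formula does; how the hardware rounds the result is not stated here and is not modeled. The function name and the convention of passing the weights as numerators over 16 are illustrative choices.

```python
from fractions import Fraction


def interpolate(a, b, c, d, h16, v16):
    """Bilinear interpolation of pixels A, B (top row) and C, D (bottom row).

    h16 and v16 are the weight numerators; the implied denominator is 16.
    """
    h = Fraction(h16, 16)
    v = Fraction(v16, 16)
    return (a * (1 - h) * (1 - v) + b * h * (1 - v)
            + c * (1 - h) * v + d * h * v)


# Weights of Figure 2-6: h = 6/16, v = 10/16
w = interpolate(0x10, 0x20, 0x30, 0x40, 6, 10)
print(w, float(w))
```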

The pixel interpolator can operate in two modes: sequential-2D and
random-2D. Sequential-2D mode is used for motion video decoding and
when an array of pixels is interpolated with a common weighting.
Random-2D mode is used either when the pixel arrays to be interpolated
are not adjacent pixels in two rows or when the weight is changed for
each interpolation. (The word random is used here to mean
non-sequential.)


The example in Figure 2-7 shows a single row of pixels being
interpolated in Sequential-2D mode using two rows from the original
(source) bitmap. The h and v weightings are constant for all the
interpolated pixels. In this case, the weights appear to be
approximately h = 10/16 and v = 6/16.


    A     B     E     F     I  ...    - First Input Row

       W     X     Y     Z     ...    - Interpolated Row

    C     D     G     H     K  ...    - Second Input Row


Figure 2-7. Sequential-2D Pixel Interpolation


The pixel interpolator is pipelined and requires a startup sequence to
fill the pipeline. Once filled, the pixel interpolator generates a new
interpolated pixel every two T-cycles when in Sequential-2D mode.
Source pixels are written into the interpolator as pixel pairs. In the
case above, the pixel pair BA would be written first, followed by the
pixel pair DC. It would seem more natural to refer to the pixel pair
as AB, but because of the way 8-bit pixels are arranged in 16-bit
words in VRAM, the left-most pixel on the screen is the least
significant byte position. For example, if pixel A had a hex value of
0xAA and B had a value of 0xBB, the 16-bit word containing pixels A
and B would have a value of 0xBBAA.
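
The byte ordering of a pixel pair can be expressed in two lines of Python. The helper names are hypothetical; the sketch only restates the rule that the left-most pixel occupies the least significant byte of the 16-bit word.

```python
def pack_pair(left, right):
    """Pack two 8-bit pixels into a 16-bit word, left-most pixel in the low byte."""
    return ((right & 0xFF) << 8) | (left & 0xFF)


def unpack_pair(word):
    """Return (left, right) pixels from a 16-bit pixel pair."""
    return word & 0xFF, (word >> 8) & 0xFF


print(hex(pack_pair(0xAA, 0xBB)))   # pixels A and B from the example above
```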



Then, two pixels are read from the interpolator. Because the pipeline
isn't full yet, these pixels are read and discarded. This loop of
writing two pixel pairs and reading two output pixels continues four
times. The two pixels that are read this fourth time are the first two
valid output pixels: W and X. The interpolator may also collect output
(interpolated) pixels into pixel pairs. For example, pixels W and X,
instead of being output separately, would be combined into a 16-bit
pixel pair XW. Since there are two possible phase relationships
between the input pixel pairs and output pixel pairs, the desired
phasing (either X and W paired or Y and X paired) can be specified.




Bits    Field
15      RESERVED: Write as ZERO
14      Pipelining Select (1 = Fast, 0 = Standard)
13      Phase (0 = In Phase, 1 = Opposite Phase)
12      RESERVED: Write as ZERO
11      Pairing (1 = Output Pixel Pairs, 0 = Single Pixels)
10      Reset Bit (1 = Reset, 0 = Normal)
9:8     Mode Select Bits
7:4     Vertical Weight
3:0     Horizontal Weight


Figure 2-8. Pixel Interpolator Control Register


Random-2D interpolation is used either when the pixels to be
interpolated are not in horizontal rows or when the weight is changed
for each interpolated pixel. Examples of this are smooth warping or
smooth scaling operations. In the case of Random-2D, the processing
for successive interpolated pixels

is considered to be the first pixel of a Sequential mode
interpolation. The weight and the two input pixel-pairs are written
into the interpolator. After waiting at least 10 T-cycles, the one
interpolated pixel can be read. (The delay is 10 T-cycles when in the
standard mode (bit 14 = 0) and 6 T-cycles when in the fast mode
(bit 14 = 1).) Then, the next two input pixel-pairs and, if necessary,
the new weight value are written, and 10 cycles later the next
interpolated pixel can be read.

The h and v weight values, the mode selection, and other control bits
are written to the pixel interpolator control register (avg-c). The
bit assignment for this register is in Figure 2-8. The least
significant byte holds the 4-bit v value (bits 7:4) and the 4-bit h
value (bits 3:0).

NOTE:

The values used for h and v here are numerators of the fraction where
the implied denominator is 16.


MODE SELECT 

Bits 8 and 9 are used to select one of four operating modes, of which
only two are presently defined. These modes are given in Table 2-11.


Table 2-11. Mode Select Operating Modes

Bits 9:8    Mode
00          Random-2D
01          Sequential-2D
10          RESERVED
11          RESERVED


RESET 

Writing a ONE to bit 10 resets the pixel interpolator. 
The pixel interpolator must be reset prior to chang- 
ing modes. 

PAIRING 

A ZERO in bit 11 causes the pixel interpolator to output individual
pixels. A ONE causes the interpolator to collect adjacent pixels (in
Sequential-2D mode) into 16-bit pixel pairs. This feature assists in
motion video decoding, when combined with the ALU's
dual-add-with-saturate operation, by allowing two pixels to be
processed each cycle. The phasing used in collecting the pixel pairs
is determined by the Phase bit described below.


PHASE 

When output pixels are collected into pixel pairs, there are two
possible alignments of the input pixel pairs to the output pixel
pairs. The Phase bit (bit 13) selects the alignment to be used, based
on the relative word alignment of the source and destination bitmaps
in VRAM. When the Phase bit is set to a ZERO, this indicates that the
bitmaps are in-phase. In this case, the first two output pixels are
grouped into one 16-bit pixel pair (with the first pixel in the least
significant byte). When the Phase bit is set to a ONE, the bitmaps are
out-of-phase. In this case, the first pixel is placed in the most
significant byte of the first pixel pair, with invalid data in the
least significant byte, and the second and third output pixels are
collected into the second pixel pair. This is illustrated in
Figure 2-9.

PIPELINING 

A ZERO in bit 14 causes the pixel interpolator to use the standard
amount of pipeline delay. A ONE in this field will select the fast
mode that has less pipeline delay. Table 2-12 shows the pipelining
delay for both modes. Note that the effect of the phase bit is to add
an extra pixel delay.
[Figure: in-phase and opposite-phase alignment of output pixel pairs
against the first and second rows of input pixel pairs]


Figure 2-9. Pixel Pair Phases



Table 2-12. Pipelining Delay for Sequential-2D NON-PAIR Mode

Pipelining Bit    Phase Bit    Pipeline Delay
(Bit 14)          (Bit 13)     in Output Pixels
0                 0            6
0                 1            7
1                 0            2
1                 1            3


When in PAIR mode (with bit 11 = 1), the amount of pixel delay does
not change, but half as many reads and writes are required to fill the
pipeline because each read or write of the averager transfers two
pixels. For example, when in the standard mode (bit 14 = 0), with zero
phase (bit 13 = 0) and pair mode (bit 11 = 1), three indeterminate
pixel pairs must be read before the first good pixel pair is read. In
the same case but with the phase bit = 1, the fourth pixel pair read
contains one good pixel and one indeterminate pixel, and the fifth
pixel pair read contains two good pixels.
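
The relationship between the pixel delay of Table 2-12 and the number of pair reads can be sketched as follows. The two standard-mode results match the cases worked in the text; applying the same arithmetic to the fast-mode rows is an extrapolation and should be treated as an assumption. The names are hypothetical.

```python
# Pipeline delay in output pixels, keyed by (pipelining bit 14, phase bit 13).
PIPELINE_DELAY = {(0, 0): 6, (0, 1): 7, (1, 0): 2, (1, 1): 3}


def pair_mode_startup(fast, phase):
    """Return (fully indeterminate pair reads, index of first fully valid pair).

    Pair reads are numbered from 1. An odd delay means one pair read holds
    one indeterminate pixel and one good pixel.
    """
    delay = PIPELINE_DELAY[(fast, phase)]
    indeterminate_pairs = delay // 2
    first_full_pair = (delay + 1) // 2 + 1
    return indeterminate_pairs, first_full_pair


print(pair_mode_startup(0, 0))   # standard mode, in phase
print(pair_mode_startup(0, 1))   # standard mode, opposite phase
```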

RESERVED 

Bits 15 and 12 are reserved for future use. Write 
ZEROS into these bit positions. 
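
The control register fields described above can be combined as in the following Python sketch. The function and constant names are hypothetical; the bit positions are those given in the text (pipelining = 14, phase = 13, pairing = 11, reset = 10, mode = 9:8, v = 7:4, h = 3:0), and the reserved bits 15 and 12 are always written as zero.

```python
RANDOM_2D = 0b00
SEQUENTIAL_2D = 0b01


def pixint_control(h, v, mode, reset=0, pairing=0, phase=0, fast=0):
    """Assemble a 16-bit pixel interpolator control value.

    h and v are weight numerators (0 to 15) over an implied denominator of 16.
    """
    if not (0 <= h <= 15 and 0 <= v <= 15):
        raise ValueError("h and v are four-bit numerators")
    if mode not in (RANDOM_2D, SEQUENTIAL_2D):
        raise ValueError("mode values 10 and 11 are reserved")
    return ((fast << 14) | (phase << 13) | (pairing << 11)
            | (reset << 10) | (mode << 8) | (v << 4) | h)


# Sequential-2D with h = 6/16 and v = 10/16
print(hex(pixint_control(6, 10, SEQUENTIAL_2D)))
```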


Signature Register 

[hwid] 

The signature register can be read either by the host CPU or by
microcode to determine the version of the 82750PB. The value of the
signature register can be used to distinguish between the 82750PB in
the 82750PA emulation mode and the 82750PB in native mode. The
currently defined signature values are given in Table 2-13.


Table 2-13. Signature Values

Value       Definition
0xFFFE      The 82750PB Emulating the 82750PA
0xFFFC      The 82750PB in Native Mode


All other signature values are presently undefined 
but may be used in the future to denote other ver- 
sions of the 82750 architecture. 


Display Format Registers

[yeven, yodd, vu, vptr] 

The 82750PB's processor can write to the display registers in the VRAM
interface. These registers are pointers and pitch values that address
display bitmaps and 82750DB register loads in VRAM. Pointers are
32-bit values that specify the starting byte address of a bitmap or
register load within a 4 GByte address space. The bottom two address
bits are ignored since display bitmaps and register loads must start
on a double word boundary. Therefore, the internal representation of a
pointer is a 30-bit value. The pitch value associated with each
pointer indicates the number of bytes between the start of two lines
of a display bitmap or between the start of two register loads. The
pitch is a single 16-bit value with its two least significant bits
ignored, since the pitch must be an integer number of double words.
Currently, there is also a restriction in the 82750DB limiting all
display bitmap pitches to powers of two; so, the maximum display
bitmap pitch is ±2^14 Bytes = ±16 kBytes. The display registers are
described in Table 2-14.
Table 2-14. Display Registers

Register        Description

yeven-lo, hi    This register pair points to the start of the Y bitmap
                or main bitmap that is to be displayed during an even
                field scan.

yodd-lo, hi     This register pair points to the start of the Y bitmap
                or main bitmap that is to be displayed during the odd
                field scan.

ypitch          The value in this register is added to the current Y
                bitmap pointer value each time a Y transfer is
                performed.

vu-lo, hi       This register pair points to the start of the VU
                bitmap. This bitmap is read to generate the VU values
                for both odd and even field scans.

vupitch         This value is added to the current VU bitmap pointer
                value each time a VU transfer is performed.

vptr-lo, hi     This register pair points to the start of a series of
                82750DB register loads stored in VRAM.

vpitch          This value is added to the current 82750DB register
                load pointer each time an 82750DB register load is
                performed. The pitch is equal to the number of bytes
                from the start of one register load to the start of
                the next register load.


3.0 HARDWARE INTERFACE 


VRAM Interface 

The VRAM Interface performs the following opera- 
tions: 

• Maintains VRAM pointers for the two input FIFOs, 
the two output FIFOs, the statistical decoder, the 
Y (main) bitmap, the VU bitmap, and the 
82750DB register load. 

• Decodes VBUS codes and takes appropriate ac- 
tions such as generating a transfer cycle, sched- 
uling refresh cycles, or generating interrupt condi- 
tions. 


• Arbitrates VRAM accesses between the two input
FIFOs, the two output FIFOs, the statistical de-
coder, the transfer request logic, the VRAM re-
fresh logic, and the external VRAM access logic.

• During a memory cycle, performs appropriate ad-
dress arithmetic on the VRAM pointer used for
that memory cycle.

• As a result of certain VBUS codes, performs a 
shadow copy that consists of copying display-re- 
lated VRAM pointer values from shadow registers 
(that are loaded by the host CPU or the micro- 
code processor) to working registers where the 
various pointers are used for transfer cycles 
when the 82750DB is refreshing the display 
screen. 


Table 3-1. VRAM Interface Signals

Signal      Description

MREQ#       MEMORY REQUEST is asserted during the first cycle of a
            VRAM memory access.

TRNFR#      The TRANSFER output indicates the current memory cycle is
            a result of an 82750DB transfer request.

RFSH#       The REFRESH output indicates the current memory cycle is a
            result of an 82750DB refresh request.

NXTFST#     The NEXT FAST output indicates the next memory access will
            use the same row address as the current memory access.
            This facilitates the use of page mode memory accesses.

MRDY#       The MEMORY READY input indicates the availability of valid
            data on the D[31:0] pins.
VRAM ACCESSES

The 82750PB can initiate five different types of memory accesses: FIFO
read, FIFO write, transfer read, transfer write, and refresh. In
addition, the 82750PB supports VRAM accesses by external logic. During
an external access VRAM cycle, the 82750PB tri-states its VRAM address
and data buses and performs a host VRAM read or host VRAM write cycle.
There is another operation performed by the 82750PB, a shadow copy,
that is not a VRAM cycle but is arbitrated as though it were, since no
VRAM cycles can take place during a shadow copy.

The seven types of VRAM cycles initiated by the 82750PB, including
host VRAM read and host VRAM write, begin with the 82750PB asserting a
combination of its three VRAM cycle definition outputs: TRNFR#, RFSH#,
and WE#. External logic detects the state of these signals, validated
by MREQ#, and produces the appropriate sequence of VRAM control
signals (RAS, CAS, etc.) to perform the type of memory cycle the
82750PB has requested. The 82750PB requires that each of these VRAM
cycles take a minimum of two T-cycles, or T-states, denoted T1 and T2.
External logic can insert additional T2 states in order to stretch the
VRAM cycle to more than two T-cycles. The start of a new VRAM access
cycle is signaled by the assertion of MREQ# for the first T-cycle, T1.
The VRAM access cycle definition signals, TRNFR#, RFSH#, and WE#, are
asserted at the start of T1 and remain asserted until the end of the
last T2. Other VRAM operations can be described similarly by sequences
of T-states. Refer to Figures 3-4 and 3-5 on page 42 for timing
diagrams.

Table 3-2 defines the states used for all VRAM access operations. A
state diagram for the VRAM/Host Interface is provided in Figure 3-1.
This diagram includes the FIFO access states.


Table 3-2. 82750PB VRAM Access States

State       Description
Ti          Idle State, No VRAM Activity
T1, TF1     First State of a VRAM FIFO Cycle
T2, TF2     Last State of a VRAM FIFO Cycle
TSC         The T-State required to perform a shadow copy
TTX1        First State of a VRAM Transfer Cycle
TTX2        Last State of a VRAM Transfer Cycle
TRF1        First State of a VRAM Refresh Cycle
TRF2        Last State of a VRAM Refresh Cycle
Note that during successive VRAM cycles it is not necessary to go back
to the idle state, Ti, between each cycle; the T2 state can be
followed directly by a T1 state, starting the next VRAM cycle. This
results in efficient utilization of the 82750PB/VRAM bandwidth by
allowing a VRAM cycle time of 2 T-states.


FAST VRAM CYCLES 

When the 82750PB performs Data Read or Data Write VRAM cycles for the
input or output FIFOs, it performs two 32-bit accesses to read or
write one 64-bit value. These accesses are always performed in a
sequence of EvenAddress followed by EvenAddress + 1, which guarantees
both that the two sequential accesses will be in opposite banks and
that the two accesses will be within the same VRAM page. This allows
external logic to use either bank-interleaving or a page-mode access
to complete the second access of the operation and improve the VRAM
bandwidth. However, the second access does not need to be handled
differently from the first. Except for the assertion of the NXTFST#
signal, both accesses are treated as standard VRAM accesses. External
logic can ignore the NXTFST# signal, though, and treat the two
accesses as two normal data read or data write cycles. Note that
NXTFST# is not asserted for transfer, refresh, or host memory
accesses.


The NXTFST# output signal is provided for cases when external logic
can generate a faster access for the second access of the two
sequential accesses. During such a pair of accesses, NXTFST# is
asserted during the first of the two accesses in order to provide
sufficient time for the external logic to generate the appropriate
fast memory cycle for the second access. Refer to the timing diagrams
in Figures 3-4 and 3-5 (page 42) for examples illustrating the use of
the NXTFST# signal.

VBUS CODES 

Transfer request, interrupt, and synchronization codes are sent over
the VBUS from the 82750DB to the 82750PB. The codes recognized by the
82750PB are listed in Table 3-3, along with the actions taken by the
82750PB as a result of receiving each code. Codes that cause TRANSFER
cycles must be asserted for at least two clock cycles of the 82750PB
to ensure that, in the worst case, the 82750PB completes the transfer
cycle before the code is released and the 82750DB starts shifting data
from the VRAM shift registers. Other codes must also be asserted for a
minimum of two 82750PB clock cycles. Only the codes given in Table 3-3
are valid codes for the VBUS. Other codes are reserved for future use
and should not be used. Once a transfer cycle code is sent to the
82750PB, any non-transfer code may be sent immediately. A subsequent
transfer cycle code should be sent only after the current transfer
cycle is completed.
Table 3-3. VBUS Codes

Binary   Name         Action
0000     YBMX         TXRD Cycle Using Yc; Yc = Yc + Yp*
0001     VUBMX        TXRD Cycle Using VUc; VUc = VUc + VUp
0010     REGX         TXRD Cycle Using Vc; Vc = Vc + Vp
0011     WRDIGX       TXWR Cycle Using Yc; Yc = Yc + Yp
0100     YNPBMX       TXRD Cycle Using Yc; Yc = Yc
0101     Reserved     Reserved
0110     Reserved     Reserved
0111     WRDIGNPX     TXWR Cycle Using Yc; Yc = Yc
1000     DFL          DFL Int; Shadow Copy**
1001     82750DBSD    82750DB Shutdown Interrupt
1010     REFRESH      Schedule N Refresh Cycles
1011     Reserved     Reserved
1100     VODD         VBI Int; OF Int; Shadow Copy Odd; Hline = 0***
1101     VEVEN        VBI Int; EF Int; Shadow Copy Even
1110     HLINE        lcnt++ (Increment Line Counter)
1111     NULL         No Action

NOTES:

*Yc: Y bitmap pointer, current; Yp: Y bitmap pitch; VU: VU bitmap; V: 82750DB register load.
**Shadow Copy with Yc = Y-start-odd in odd field; Yc = Y-start-even in even field.
***Hline: Horizontal Line Counter.


PRIORITY 

Each time the VRAM state machine completes a VRAM operation and
returns to the Ti state, it examines all pending VRAM access requests
and selects the highest priority request for the next VRAM operation.
The priority ordering of these requests is listed in Table 3-4.


Table 3-4. Priority of VRAM Operations

Request Type        Priority
Transfer Cycle      Highest
Shadow Copy            .
Host Access            .
VRAM Refresh           .
FIFO Read/Write     Lowest


NOTE: 

The shadow copy is treated as a VRAM operation even 
though it does not result in an access to VRAM. 


The VRAM refresh operation is placed low on the priority list to
reduce the latency in servicing transfer requests and external VRAM
requests. Since a single REFRESH code from the 82750DB schedules a
number of refresh cycles, a higher priority for refresh would cause
all the refresh cycles to occur in a burst that would lock out all
lower priority requests until all refresh cycles completed. Instead,
the following restriction applies to all request types with higher
priority than refresh: high priority requests, such as transfer
cycles, shadow copies, and external VRAM accesses, must occur
infrequently enough to allow proper refresh of the VRAM chips.
Transfer cycles and shadow copies, by their nature, occur infrequently
so they are not generally a problem.

There is a separate priority scheme for the five FIFO channels. The
scheme used is rotating priority with automatic override and single
cycle arbitration. Rotating priority means that the priority is
assigned in a fixed cyclic order with the lowest priority given to the
FIFO channel that "won" the last FIFO access. There is only one level
of memory, so the order in which requests arrive is not a factor in
the arbitration. The cyclic order is given in Figure 3-2.

As an example, if input FIFO 0 (abbreviated if0) was the last channel
to perform a cycle, the priority order for the next FIFO access (from
highest to lowest) would be: if1, sd, of0, of1, and if0.
Automatic override means that the rotating cyclic priority can be
bypassed if there is an URGENT condition for one of the channels. A
channel is urgent if the microcode processor is frozen because the
processor is waiting for that channel to be ready. The channel can be
either an input channel that is empty or an output channel that is
full. In this case, the urgent channel gets the next available cycle.
However, the priority will still be lower than non-FIFO requests, such
as refresh cycles.
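
The rotating priority scheme with automatic override can be modeled in a few lines of Python. This is a behavioral sketch only. The cyclic order used here (if0, if1, sd, of0, of1) is inferred from the single example given in the text and should be treated as an assumption; the function names are hypothetical.

```python
# Assumed cyclic order, inferred from the if0 example in the text.
CYCLIC_ORDER = ["if0", "if1", "sd", "of0", "of1"]


def priority_order(last_winner):
    """Channels from highest to lowest priority; the last winner goes last."""
    i = CYCLIC_ORDER.index(last_winner)
    return CYCLIC_ORDER[i + 1:] + CYCLIC_ORDER[:i + 1]


def next_channel(last_winner, requesting, urgent=None):
    """Pick the FIFO channel that gets the next available VRAM cycle.

    An urgent channel (the processor is frozen waiting on it) bypasses the
    rotation. Non-FIFO requests such as refresh are outside this model.
    """
    if urgent is not None and urgent in requesting:
        return urgent
    for channel in priority_order(last_winner):
        if channel in requesting:
            return channel
    return None


print(priority_order("if0"))
print(next_channel("if0", {"of1", "if0"}))
print(next_channel("if0", {"of1", "if0"}, urgent="if0"))
```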

Single clock cycle arbitration means that the selec- 
tion of the next channel that will get an access oc- 
curs in a single T-cycle or T-state, either in a Ti state 
or during the last T2 state of the previous VRAM 
cycle. 

VRAM POINTERS 

The VRAM interface maintains VRAM pointers for the FIFOs, as well as
display-related pointers for the 82750DB. Internally each pointer or
address is stored as a 30-bit value addressing a double word in VRAM.
The pointer values are read and written as two 16-bit words
representing a 32-bit byte address (refer to Figure 3-3). With a
30-bit double word address, the 82750PB can decode a VRAM address
space of 1G double words or 4 GBytes.
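
The relationship between the 32-bit byte address, its two 16-bit register halves, and the internal 30-bit double word address can be sketched in Python as follows. The function names are hypothetical; the "-lo" and "-hi" halves correspond to the least and most significant words described above.

```python
def split_pointer(byte_address):
    """Split a 32-bit byte address into its (-lo, -hi) 16-bit register words."""
    return byte_address & 0xFFFF, (byte_address >> 16) & 0xFFFF


def dword_address(lo, hi):
    """Combine the two words and drop the two byte-select bits (30-bit result)."""
    return ((hi << 16) | lo) >> 2


lo, hi = split_pointer(0x00012344)
print(hex(lo), hex(hi), hex(dword_address(lo, hi)))
print((1 << 30) * 4 == 1 << 32)   # 1G double words cover 4 GBytes
```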

Input and output FIFOs can address down to a sin- 
gle word or byte in VRAM. A FIFO’s pointer is post- 
incremented or post-decremented in parallel with its 
VRAM read or write cycle. 

The statistical decoder can only start decoding bitstreams on double
word boundaries in VRAM and can only increment through VRAM. The
decoder's pointer is post-incremented in parallel with each of its
VRAM read cycles.

Display-related pointers are updated by adding a 
pitch value to the current value during the corre- 
sponding transfer cycle. 


If a VRAM pointer appears on the B-Bus as source 
or as a destination then the following rules apply: 

Rule 1 

If a B-Bus destination refers to an address that is
both Even and >0x1f, then the source is restricted
to “-lo” pointers if the source refers to a pointer.

Rule 2 

If a B-Bus destination refers to an address that is 
both Odd and >0x1f, then the source is restricted to 
“-hi” pointers if the source refers to a pointer. 

SHADOW COPY 

When a VODD, VEVEN, or DFL code is received from the 82750DB over the
VBUS, a shadow copy is scheduled and will occur as soon as the
priority logic allows. Any VRAM access in progress must complete and a
pending transfer cycle, if any, must be performed before the shadow
copy can start. During the operation, shadow registers for the
Y-START, Y-PITCH, VU-START, VU-PITCH, 82750DB-START, and 82750DB-PITCH
are copied into the corresponding working registers. During display
refresh, the address arithmetic is performed on the working registers.
The shadow registers can be loaded by the host CPU or by a microcode
routine with less critical timing constraints, and then copied
instantly by a shadow copy when it is time to update the registers,
either prior to the next field or during the active display for split
screen effects.


[Figure: cyclic priority ring connecting inFIFO 1, inFIFO 0,
outFIFO 1, outFIFO 0, and the Statistical Decoder]


Figure 3-2. Cyclic Ordering of FIFOs


 31 30 29      24 23      16 15              3  2    1  0

 <------------- VRAM Address (30 bits) ------------->  <-->
                                                       Byte Address
                                                       within Double-Word

 <- Most Sig. Word of VRAM Address -> | <- Least Sig. Word of VRAM Address ->


Figure 3-3. VRAM Addressing
There are actually two shadow registers for Y-START:
one for the start of odd fields and one for the start
of even fields. A VODD code causes Y-START-ODD
to be copied into the working register Y-CURRENT.
Similarly, a VEVEN code causes Y-START-EVEN
to be copied into Y-CURRENT. A DFL code
causes the Y-START-ODD value to be copied if the
most recent start of field code received was a VODD,
or the Y-START-EVEN value if the most recent start of
field code was a VEVEN. This allows a simple interlaced
or non-interlaced display to be refreshed with
no host CPU intervention. For more complex displays,
such as split screens, the host CPU must update
the shadow registers prior to each shadow
copy. A shadow copy operation requires 2 T-cycles.
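
The selection rule for Y-CURRENT described above can be summarized as a small
behavioral model. This is only a sketch in Python, not a description of the
hardware implementation; the function name is hypothetical, and the behavior of
a DFL code received before any start of field code is not stated in this
section, so the model simply rejects that case.

```python
def y_start_source(code, last_field_code=None):
    """Return the shadow register copied into Y-CURRENT for a VBUS code.

    code            -- "VODD", "VEVEN", or "DFL"
    last_field_code -- most recent start of field code ("VODD" or "VEVEN")
    """
    if code == "VODD":
        return "Y-START-ODD"
    if code == "VEVEN":
        return "Y-START-EVEN"
    if code == "DFL":
        # A DFL code reuses the shadow register of the current field.
        if last_field_code == "VODD":
            return "Y-START-ODD"
        if last_field_code == "VEVEN":
            return "Y-START-EVEN"
        # Assumption: the case with no prior field code is left undefined.
        raise ValueError("no start of field code has been received")
    raise ValueError("not a shadow copy code: %r" % (code,))
```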


Host Interface 

The Host Interface provides the following functions:

• Arbitrates host CPU and 82750PB access to 
VRAM. 

• Provides the host access to external devices. 

• Provides the host access to 82750PB internal 
registers and memories. 

Signals specific to the Host Interface are listed in 
Table 3-5. 


Table 3-5. Host Interface Signals

Signal      Description

HREQ#       HOST REQUEST: Asynchronous request from the host for all types of
            host access. Used both to request and release system buses.

HREG#       HOST REGISTER: Single-ranked control to request host access to
            82750PB internal registers in concert with HRAM#.

HRAM#       HOST VRAM: Single-ranked control to request host access to VRAM in
            concert with HREG#.

HALEN#      HOST ADDRESS LATCH ENABLE: Asynchronous status from the host
            indicating the presence of valid address, write enable (transaction
            direction control), and the byte enables at the interface of the 82750PB.

HBUSEN#     HOST BUS ENABLE: 82750PB synchronous status granting the host
            access to the address, write enable, data bus, and byte enables at the
            interface of the 82750PB.

HRDY#       HOST READY: 82750PB synchronous status to the host indicating the
            presence of valid data appearing at the 82750PB’s data bus for VRAM
            and register accesses and optionally for external accesses.

HINT#       HOST INTERRUPT: 82750PB synchronous interrupt to the host, set
            under direct or indirect microprogram control.


Signals common to the host, VRAM, and external device interfaces are listed in Table 3-6. 


Table 3-6. Host, VRAM, and External Device Interfaces

Signal      Description

A[31:2]     ADDRESS BUS: System address bus used to select unique VRAM, the
            82750PB register, and external device locations that will be accessed
            under host control. The lower seven bits A[8:2] are bidirectional and are
            used during register accesses.

D[31:0]     DATA BUS: Bidirectional system data bus used to transfer data to and
            from all sources and destinations. When transferring 16-bit host register
            values, the data bus MSH and LSH will both carry identical values.

WE#         WRITE ENABLE: Bidirectional, single-ranked signal used to determine
            the data transfer direction. When active during host register cycles, data
            flows from the host to an 82750PB destination. During host VRAM cycles,
            WE# active will define the data direction to be from the host to VRAM.

BE[3:0]#    BYTE ENABLE: Bidirectional signals used to select the bytes that will be
            modified during data transactions. All host register transactions are
            performed 16 bits at a time, while VRAM may be modified 8 bits at a time.
As with VRAM operations, host operations are described through a sequence of T-states. Table 3-7 defines
the T-states used to implement all host transactions with VRAM, external devices, and the 82750PB. 

The master execution state diagram that defines the VRAM/Host transactions is provided in Figure 3-1. 


Table 3-7. 82750PB Host Transaction States

State   Description

TA      First state of any host transaction. Entry into TA will be granted after
        HREQ# has been asserted. During this state, the 82750PB will tri-state
        its address, data bus, write enable, and byte enable signals to provide a
        full cycle of “dead-band” before the assertion of HBUSEN#. In the state
        immediately following TA, HBUSEN# will assert, allowing the host to drive
        the host buses.

TB      First cycle in which the host is granted bus access for register or VRAM
        transactions. The sequencer will remain in TB until HALEN# is received,
        indicating that the address, write enable, and byte enable signals are
        stable at the 82750PB pins.

TC1     First cycle that output data is valid.

TCn     This state is entered to wait for the completion of the current host cycle.
        The cycle is defined as complete when HREQ# deasserts. HRDY# is
        asserted along with valid data until the transition to state TD occurs.

TD      The last cycle of a host transaction. HBUSEN# is deasserted, allowing
        one dead-band cycle to allow control of the address, data, write enable,
        and byte enable signals to be returned to the 82750PB.

TV1     First cycle of a Host VRAM transaction. Memory is requested and is
        followed by a transition to TV2.

TV2     Last cycle of a Host VRAM transaction. The sequencer will remain in TV2
        until MRDY# is received.


A single stage of input synchronization is employed
for HREG#, HRAM#, WE#, and BE[0]#, while
HREQ# and HALEN# are programmable to have
one or two stages by bit 12 of the Microcode Processor
Control Register; see Table 3-10. T-state transitions
are caused by the synchronized versions of
these signals.

The synchronized versions of HREG# and HRAM#
must be stable before entry into T-state TA. The
synchronized versions of WE#, BE[0]#, and
HALEN# should be stable before exiting T-state
TB. Once asserted, all of the above signals should
remain stable until the deassertion of HBUSEN#.

The type of host cycle to perform is determined by 
the states of HREG# and HRAM# as indicated in 
Table 3-8. 


Table 3-8. Host Cycle Types

HREG#    HRAM#    Host Cycle Type

1        1        External
0        1        Register
1        0        VRAM
0        0        Reserved
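
As an illustration, Table 3-8 can be expressed as a lookup keyed on the pin
levels. This is a minimal Python sketch; the names are hypothetical, and the
arguments are the logic levels on the active-low pins (0 = asserted), as in the
table.

```python
# Keys are (HREG# level, HRAM# level) exactly as listed in Table 3-8.
HOST_CYCLE_TYPES = {
    (1, 1): "External",
    (0, 1): "Register",
    (1, 0): "VRAM",
    (0, 0): "Reserved",
}


def host_cycle_type(hreg_n, hram_n):
    """Return the host cycle type for the given HREG#/HRAM# pin levels."""
    return HOST_CYCLE_TYPES[(hreg_n, hram_n)]
```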


HOST REGISTER ACCESS 

The host has access to the 82750PB’s internal reg- 
isters and memories to monitor and control the oper- 
ation of the microcode processor, provide a means 
of debugging microprogram routines, and to function 
as the primary test port for production testing. 

Register access is initiated by the host asserting
HREQ#, HREG#, and HRAM# as shown in Table
3-8 and in the timing diagrams on pages 42 through
45. After the host has been granted bus access by
an active HBUSEN# in state TB, the address, write
enable, and byte enables may be driven. After these
signals have stabilized, HALEN# is asserted,
enabling a read or a write operation to occur.
In the case of a register read, state TC1 is entered
and the data bus is driven with the internal value.
One cycle later, a transition to state TC occurs, and
HRDY# activates, signaling the presence of stabilized
data at the 82750PB data pins. This state (TC)
will be maintained until the host deasserts HREQ#,
signaling the completion of the cycle, which causes a
transition to state TD.

In the case of a register write, TC1 is again entered 
(from TB), but the data bus may now be driven by 
the host. (During host cycles, data bus drive activity 
is indirectly controlled by WE# and an additional 
dead-band is provided by entry into state TC1 to al- 
low for internal WE# stabilization.) Stable data at 
the 82750PB interface, as well as the completion of 
the write cycle, is signaled by the deassertion of 
HREQ#. As with reads, the deactivation of HRDY# 
signals the transition to state TD. 

As state TD is entered, HRDY# and HBUSEN#
deassert, the address, data, write enable, and byte
enables tri-state, and bus control is returned to the
82750PB in the following cycle.

HOST VRAM ACCESS 

Because the 82750PB is so closely coupled with
VRAM, host accesses to VRAM are arbitrated and 
controlled by the 82750PB. VRAM access is initiated 
by the host asserting HREQ#, HREG#, and 
HRAM# as shown in the Host Cycle Table above 
and in the timing diagrams on pages 42 through 45. 
After the host has been granted bus access by an 
active HBUSEN#, the address, write enable, and 
byte enables may then be driven. After these signals 
have stabilized at the memory devices (or longest 
relevant propagation path), HALEN# is asserted, 
enabling a read or a write operation to occur. 

Because VRAM will not drive the data bus until after
a memory request, a transition into state TC1 to allow
for data bus direction stabilization is not required.
Instead, a transition to state TV1 occurs,
which asserts MREQ# for a single cycle and is followed
by a transition to TV2. TV2 will remain the
current state until the reception of an active
MRDY#.

In the case of a VRAM read, the memory data bus 
will be driven during TV1, and valid data will appear 
in state TV2. Data will be guaranteed valid coinci- 
dent with the deassertion of MRDY# from memory. 

In the case of a VRAM write, the memory data bus is
driven with valid data during TV1. Again the reception
of MRDY# will serve to indicate the completion
of the memory operation.


NOTE: 

The host device must transmit or receive memory
data such that the data is valid at the trailing
edge of MRDY# at the data’s destination (memory
or host).

After MRDY# becomes active, a transition from TV2
into TC1 is accomplished to allow time to propagate
data to the host. TC is then entered to await the
deassertion of HREQ# (if it has not already occurred).
TD is then entered, duplicating the dead-banding
previously described.

HOST EXTERNAL ACCESS 

In addition to VRAM and register host access, an
external device access mechanism is provided. During
this access, upon the receipt of HREQ# with
HREG# and HRAM# inactive, the 82750PB releases
the address, data, write enable, and byte enables
in state TA.

The difference here is that state TC1 is directly entered
from TA, thereby ignoring any transitions of
HALEN#. Since the 82750PB also ignores the data
bus direction control (write enable), the host and an
external device may communicate unencumbered
by the 82750PB.

Entry into state TC directly follows TC1 in the expected
sequence and remains there until HREQ# is
released. This is followed by entry into TD.
HBUSEN# is asserted during the time that TC1
and TCn are active.

During an external access, HRDY# is not asserted 
unless the external logic asserts MRDY# as shown 
in Figure 3-7. 
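
The three access types described above differ mainly in the order of T-states
they pass through. The following Python sketch lists those orders as read from
the text; it is a simplified model with hypothetical names, and it shows each
state once even though TB, TV2, and TCn may be held for several cycles while
the sequencer waits for HALEN#, MRDY#, or the deassertion of HREQ#.

```python
def host_t_states(cycle_type):
    """Return the nominal T-state order for a host cycle type."""
    if cycle_type == "Register":
        # TB waits for HALEN#, TC1 is the direction dead-band / first data.
        return ["TA", "TB", "TC1", "TCn", "TD"]
    if cycle_type == "VRAM":
        # TV1 requests memory, TV2 waits for MRDY#, then data propagates.
        return ["TA", "TB", "TV1", "TV2", "TC1", "TCn", "TD"]
    if cycle_type == "External":
        # TC1 is entered directly from TA; HALEN# is ignored.
        return ["TA", "TC1", "TCn", "TD"]
    raise ValueError("no sequence defined for %r" % (cycle_type,))
```

Every sequence starts with TA and ends with TD, which reflects the dead-band
cycle on each side of the host's ownership of the buses.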

HOST REGISTER ADDRESS MAPPING 

Table 3-9 shows the host address mapping of the
on-chip registers and memories, in terms of the offset
in bytes from the base address for 82750PB
accesses. Note that the 82750PB only supports
word accesses to these registers. Therefore, the
least significant bit of the byte offset should be set to
zero. The 82750PB forms the register address from
inputs on the A[31:2] pins and BE#[3:0] pins. The
A[31:2] pins specify the double word address of the
register, and combinations of the BE# pins determine
which of the two words within the double word is being
addressed. BE#[3:0] = 1100 (binary) selects the least
significant word within a double word, and BE#[3:0] =
0011 (binary) selects the most significant word within a
double word. These are the only two valid patterns
for BE# inputs during a host register access cycle.
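
To make the address formation concrete, the sketch below converts a register
byte offset into the double word address carried on A[31:2] and the BE#[3:0]
pattern. It is an illustrative Python model only; the function name and the
`base_address` parameter are hypothetical, and the base is assumed to be
double word aligned.

```python
def register_access_pins(byte_offset, base_address=0):
    """Return (double word address on A[31:2], BE#[3:0] pattern)."""
    if byte_offset % 2 != 0:
        raise ValueError("bit 0 of the byte offset must be zero")
    if not 0 <= byte_offset <= 0x1FE:
        raise ValueError("offset is outside the register map of Table 3-9")
    if base_address % 4 != 0:
        raise ValueError("assumed: base address is double word aligned")
    address = base_address + byte_offset
    dword_address = address >> 2
    # Bit 1 of the byte address picks the word inside the double word.
    be_n = 0b0011 if address & 0b10 else 0b1100
    return dword_address, be_n
```

For example, offsets 0x100 and 0x102 share one double word address and differ
only in the BE# pattern.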
Table 3-9. Host Address Mapping

Byte Address     Description

0x000-0x07E      (a) A source and destination registers
0x080-0x0FE      (b) B source and destination registers
0x100-0x17E      (c) Microcode processor control and status registers
0x180-0x1FE      (d) VRAM pointer RAM

NOTE: 


The host should only perform 16-bit word reads 
or writes to 82750PB registers. The 82750PB 
does not support byte reads or writes or double 
word reads or writes to on-chip registers. 


When the host CPU reads or writes to areas (a, b, or
d) and the 82750PB is not already in a HALT state,
the microcode processor is automatically HALTED
for the one T-cycle actually required to complete the
data transfer, and then the processor is restarted
after the transfer is complete. If the 82750PB is in a
HALT state when the host access is initiated, it will
remain in the HALT state following the completion of
the access. This is transparent to both the host CPU
and the microcode processor.


During an access to areas (a) or (b), bits 6:1 of the 
byte offset should be set to the source or destina- 
tion code for the register that will be read or written. 
The coding is the same as used in the microcode 
instruction word. Bit 0 is always set to a zero. Refer
to the 82750PB Source and Destination Coding
Table found in Chapter 4.

Area (c) contains one write-only register, the CONTROL
register, and two read-only registers, the INTERRUPT
FLAG register and the microcode PROCESSOR
STATUS register. The CONTROL register is
used to halt or single-step the microcode processor,
enable or mask interrupts to the host CPU,
select the signal that is output via the PMON/FRZ
pin, and enable or disable the 82750PA emulation
mode. The bit assignments for the CONTROL register
are given in Table 3-10.

During reset of the 82750PB, the HALT bit is set to a
one, the six Interrupt Enable bits are reset to zero,
the Disable SYNC bit is set to zero, the PMON/FRZ
bit is set to zero (so that the FRZ signal is output),
and the Enable 82750PB bit is reset to zero (so that
on reset, the 82750PB starts in 82750PA emulation
mode).




Table 3-10. Bit Assignments for Microcode Processor CONTROL Register
(Write-Only, Byte Offset = 0x100)

Bit           Name              Description

Bit 0         HALT              1 = Microcode Processor Halt
                                0 = Microcode Processor Run

Bit 1         SINGLE-STEP       1 = Execute One Instruction and then Halt
                                    (Only when Already Halted, Bit 0 = 1)
                                0 = No Action

Bit 2         Enable MCINT      1 = Enable Microcode Interrupts to Host CPU
                                0 = Mask Microcode Interrupts

Bit 3         Enable VBI        1 = Enable Vertical Blanking Interrupt to Host CPU
                                0 = Mask Vertical Blanking Interrupt

Bit 4         Enable DFL        1 = Enable DFL Interrupt to Host CPU
                                0 = Mask DFL Interrupt

Bit 5         Enable SD         1 = Enable 82750DB Shutdown Interrupt to Host
                                0 = Mask SD Interrupt

Bit 6         Enable OFI        1 = Enable Odd Field Interrupt
                                0 = Mask OF Interrupt

Bit 7         Enable EFI        1 = Enable Even Field Interrupt
                                0 = Mask EF Interrupt

Bits 8-11*                      RESERVED; Write as Zeros

Bit 12        Disable SYNC      1 = Disable Synchronizers for HREQ#/HALEN#
                                0 = Enable Synchronizers for HREQ#/HALEN#

Bit 13        PMON/FRZ          1 = Output FRZ# Signal on PMFRZ# Pin
                                0 = Output PMON# Signal on PMFRZ# Pin

Bit 14                          RESERVED; Write as Zero

Bit 15        Enable 82750PB    1 = Enable 82750PB Mode
                                0 = Enable 82750PA Emulation Mode

*All other bits are reserved for future use, and should be written as zeros.
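
A host driver would typically build the CONTROL word from named bit masks. The
Python sketch below follows the bit positions of Table 3-10; the constant and
function names are hypothetical. Because the register is write-only, each write
has to supply the complete intended value rather than modify a value read back.

```python
HALT = 1 << 0
SINGLE_STEP = 1 << 1
ENABLE_MCINT = 1 << 2
ENABLE_VBI = 1 << 3
ENABLE_DFL = 1 << 4
ENABLE_SD = 1 << 5
ENABLE_OFI = 1 << 6
ENABLE_EFI = 1 << 7
DISABLE_SYNC = 1 << 12
PMON_FRZ = 1 << 13
ENABLE_82750PB = 1 << 15

# Bits 8-11 and bit 14 are reserved and must be written as zeros.
RESERVED_MASK = 0x4F00


def control_word(*flags):
    """OR the given flag masks into a 16-bit CONTROL register value."""
    value = 0
    for flag in flags:
        value |= flag
    if value & RESERVED_MASK or value > 0xFFFF:
        raise ValueError("reserved bits must be written as zeros")
    return value
```

For instance, `control_word(HALT)` gives 0x0001, which keeps the processor
halted with all interrupts masked.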
The INTERRUPT FLAG register holds a flag for
each of the six interrupt sources. A flag bit is set to a
one when the interrupt condition is detected (independent
of the state of the corresponding Interrupt
Enable/Mask bit in the CONTROL register), and all
flags are cleared to zero each time the INTERRUPT
FLAG register is read. If this register is read during
the same cycle that an interrupt condition is detected,
the flag bit corresponding to that interrupt condition
will remain at a one. This new interrupt condition
will then be seen by the host processor when it next
reads the INTERRUPT FLAG register. This
ensures that an interrupt is not lost if it occurs in the
same cycle that the INTERRUPT FLAG register is
read (and reset). In addition, the Microcode Interrupt
source has an overflow flag that indicates if more
than one Microcode Interrupt has occurred since the
INTERRUPT FLAG register was last read. The bit
assignments for the INTERRUPT FLAG register are listed
in Table 3-11.


The PROCESSOR STATUS register holds four
status bits: HALT, FREEZE, PMON, and SYNC
status. HALT indicates that the processor is HALTED
due to the HALT bit in the CONTROL register being
set to a ONE or due to the HALT# pin being
asserted. FREEZE indicates that the processor is
waiting for one of the VRAM channels to become
ready or is waiting for an access to the VRAM pointer
RAM. PMON is a signal that can be toggled by a
special ALU opcode or a special B source code.
This signal can be used for performance monitoring
of microcode. The SYNC status bit indicates the presence
or absence of the internal synchronizers for the
HREQ# and HALEN# inputs. In addition, the Interrupt
Mask bits that are written into the PROCESSOR
CONTROL register can be read from this register.
These mask bits are read in the same polarity that
they are written, but note that the bit positions and
bit ordering are not consistent with the PROCESSOR
CONTROL register. The bit assignments for
this register are given in Table 3-12.

Address mappings for areas (a), (b), and (d) are given
in Tables 3-13 to 3-15.


Table 3-11. Bit Assignments for INTERRUPT FLAG Register
(Read-Only, Byte Offset = 0x100)

Bit         Description

Bit 8:0     Not Used, the State of These Bits Is Not Specified
Bit 9       EF Interrupt Flag
Bit 10      OF Interrupt Flag
Bit 11      MCINT Overflow Flag
Bit 12      82750DB Shutdown Interrupt
Bit 13      MCINT Microcode Interrupt
Bit 14      VBI Vertical Blanking Interrupt
Bit 15      DFL Display Format Load Interrupt
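
Because reading the INTERRUPT FLAG register clears it, a host routine would
normally read it once and decode the saved value. The Python sketch below
decodes such a value using the bit positions of Table 3-11; the flag names and
the function are hypothetical labels, and bits 8:0 are masked off because their
state is not specified.

```python
# (bit position, label) pairs taken from Table 3-11, in ascending bit order.
INTERRUPT_FLAG_BITS = [
    (9, "EF"),
    (10, "OF"),
    (11, "MCINT_OVERFLOW"),
    (12, "SD"),
    (13, "MCINT"),
    (14, "VBI"),
    (15, "DFL"),
]


def decode_interrupt_flags(value):
    """Return the labels of the flags set in a saved register value."""
    value &= 0xFE00  # ignore the unspecified bits 8:0
    return [name for bit, name in INTERRUPT_FLAG_BITS if value & (1 << bit)]
```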
Table 3-12. Bit Assignments for PROCESSOR STATUS Register
(Read-Only, Byte Offset = 0x102)

Bit         Description

Bit 0       HALT (1 = Halted, 0 = Running)
Bit 1       FREEZE (1 = Frozen, 0 = Running)
Bit 2       PMON (1 = Active, 0 = Inactive)
Bit 3       Synchronizers on HREQ#/HALEN# (0 = Enabled, 1 = Disabled)
Bit 9:4     Not Used, the State of These Bits Is Not Specified
Bit 10      MCINT Microcode Interrupt Mask
Bit 11      VBI Vertical Blanking Interrupt Mask
Bit 12      DFL Display Format Load Interrupt Mask
Bit 13      82750DB Shutdown Interrupt Mask
Bit 14      OF Interrupt Mask
Bit 15      EF Interrupt Mask
Table 3-13. 82750PB A Bus Source/Destination Address Mapping

Address (Hex)    ADST          ASRC

0x000            Null          Null
0x002                          hwid
0x004                          cc
0x006            maddr
0x008                          alu
0x00A            cnt           cnt
0x00C            cnt2          cnt2
0x00E            lcnt          lcnt
0x010            r0            r0
0x012            r1            r1
0x014            r2            r2
0x016            r3            r3
0x018            r4            r4
0x01A            r5            r5
0x01C            r6            r6
0x01E            r7            r7
0x020            mcode3        mcode3
0x022            mcode2        mcode2
0x024            mcode1        mcode1
0x026            pc            pc
0x028            pixint-c
0x02A            pixint        pixint
0x02C            *dram1        *dram1
0x02E            *dram2        *dram2
0x030            *dram1++      *dram1++
0x032            *dram2++      *dram2++
0x034            *dram1--      *dram1--
0x036            *dram2--      *dram2--
0x038            dram1         dram1
0x03A            dram2         dram2
0x03C            dram3         dram3
0x03E            dram4         dram4
0x040            *out1         *in1
0x042            out1++        *in2
0x044            shift-hi      *stat
0x046            out1-hi       *stat#
0x048            *out2
0x04A            out2++
0x04C            shift-r
0x04E            out2-hi
0x050            out1-c
0x052            in1-c
0x054            shift-l
0x056            in1-hi
0x058            out2-c
0x05A            in2-c
0x05C
0x05E            in2-hi
0x060            r8            r8
0x062            r9            r9
0x064            r10           r10
0x066            r11           r11
0x068            r12           r12
0x06A            r13           r13
0x06C            r14           r14
0x06E            r15           r15
0x070            cc            shift
0x072            fcnt          fcnt
0x074            *dram3        *dram3
0x076            *dram4        *dram4
0x078            *dram3++      *dram3++
0x07A            *dram4++      *dram4++
0x07C            *dram3--      *dram3--
0x07E            *dram4--      *dram4--
Table 3-14. 82750PB B Bus Source/Destination Address Mapping

Address (Hex)    BDST          BSRC

0x080            Null          Null
0x082                          alu
0x084            *dram3        *dram3
0x086            *dram4        *dram4
0x088            *dram3++      *dram3++
0x08A            *dram4++      *dram4++
0x08C            *dram3--      *dram3--
0x08E            *dram4--      *dram4--
0x090            r0            r0
0x092            r1            r1
0x094            r2            r2
0x096            r3            r3
0x098            r4            r4
0x09A            r5            r5
0x09C            r6            r6
0x09E            r7            r7
0x0A0            r8            *in1
0x0A2            r9            *in2
0x0A4            r10           *stat
0x0A6            r11           *stat#
0x0A8            r12           circbuf
0x0AA            r13
0x0AC            r14
0x0AE            r15
0x0B0            circbuf       literal 0
0x0B2                          literal 1
0x0B4            *dram1        literal 2
0x0B6            *dram2        literal 3
0x0B8            *dram1++      literal 4
0x0BA            *dram2++      literal 5
0x0BC            *dram1--      literal 6
0x0BE            *dram2--      literal 7
0x0C0            *out1         prof
0x0C2            out1++
0x0C4            out1-lo       out1-lo
0x0C6            out1-hi       out1-hi
0x0C8            *out2         stat-lo
0x0CA            out2++        stat-hi
0x0CC            out2-lo       out2-lo
0x0CE            out2-hi       out2-hi
0x0D0            out1-c        out1-c
0x0D2            in1-c         in1-c
0x0D4            in1-lo        in1-lo
0x0D6            in1-hi        in1-hi
0x0D8            out2-c        out2-c
0x0DA            in2-c         in2-c
0x0DC            in2-lo        in2-lo
0x0DE            in2-hi        in2-hi
0x0E0            stat-ram      r8
0x0E2            stat-c        r9
0x0E4            stat-lo       r10
0x0E6            stat-hi       r11
0x0E8            yeven-lo      r12
0x0EA            yeven-hi      r13
0x0EC            yodd-lo       r14
0x0EE            yodd-hi       r15
0x0F0            ypitch        shift
0x0F2                          stat-c
0x0F4            vu-lo         *dram1
0x0F6            vu-hi         *dram2
0x0F8            vupitch       *dram1++
0x0FA            vpitch        *dram2++
0x0FC            vptr-lo       *dram1--
0x0FE            vptr-hi       *dram2--
Table 3-15. VRAM Pointer RAM Mapping

Byte Address    Name        Description

0x180           Yw-lo       Working Copy of Y Pointer
0x182           Yw-hi
0x184           out1-lo     Output FIFO 1 Pointer
0x186           out1-hi
0x188           Yw-pitch    Working Copy of Y Pitch
0x18A                       RESERVED
0x18C           out2-lo     Output FIFO 2 Pointer
0x18E           out2-hi
0x190           VUw-lo      Working Copy of VU Pointer
0x192           VUw-hi
0x194           in1-lo      Input FIFO 1 Pointer
0x196           in1-hi
0x198           VUpitchw    Working Copy of VU Pitch
0x19A           vpitchw
0x19C           in2-lo      Input FIFO 2 Pointer
0x19E           in2-hi
0x1A0           vptrw-lo
0x1A2           vptrw-hi
0x1A4           stat-lo     Working Copy of Statistical Decoder Pointer
0x1A6           stat-hi
0x1A8           Yeven-lo    Shadow Copy of Y Start Even Pointer
0x1AA           Yeven-hi
0x1AC           Yodd-lo     Shadow Copy of Y Start Odd Pointer
0x1AE           Yodd-hi
0x1B0           Ypitch      Shadow Copy of Y Pitch
0x1B2           rfcnt       RFSH Cycles per RFSH Code from 82750DB
0x1B4           VU-lo       Shadow Copy of VU Start Pointer
0x1B6           VU-hi
0x1B8           VUpitch     Shadow Copy of VU Pitch
0x1BA           vpitch      Shadow Copy of 82750DB Pitch
0x1BC           vptr-lo     Shadow Copy of 82750DB Pointer
0x1BE           vptr-hi

NOTE: Register rfcnt is a write-only register and should never be read.


Initializing the 82750PB 

The 82750PB is placed in a RESET state by asserting
RESET# for at least ten T-cycles. In the RESET
state, which continues until RESET# is released, all
of the 82750PB’s outputs are tri-stated for compatibility
with board test requirements.

Proper initialization of the 82750PB requires that the
82750PB is held in a RESET state by keeping RESET#
active for at least 10 T-cycles, and then releasing
RESET#. This is referred to as the INITIAL
state. In the INITIAL state:

• The microcode processor is halted. 

• All six interrupts are masked, and the interrupt 
latches are cleared. 

• The 82750PA/82750PB instruction format select 
bit is set to the 82750PA. 

• The VRAM interface is ready to service VRAM 
requests; however, none of the VRAM pointers 
are valid. 
• The number of refresh cycles that will be generated
each time a RFSH code is received from the
82750DB is set to 14 cycles.

• All bidirectional I/O pins are tristated. 

After the 82750PB has been initialized, i.e., placed in 
the INITIAL state, but prior to releasing the 
82750DB’s reset signal, the following operations 
must be performed: 

• Load the REFRESH-CYCLES-PER-LINE register
with the appropriate value. The equation for the
value is VALUE = 2^N - 1, where N is the number
of cycles; for example, 5 refresh cycles would
result in VALUE = 2^5 - 1 = 31 (decimal) = 0x001F.
The refresh register is 14 bits wide. It works by
generating one refresh every time a right
shift results in a '1' bit, and it continues the right shifting
until it finds a '0' bit and halts. Hence, from a programming
point of view, 0x001F and 0xFFDF both give 5 refresh
cycles per line.

• Load the shadow copies of Y, VU, and 82750DB 
pointers and pitches. 

• Load the appropriate 82750DB Register Load list 
into VRAM starting at the address pointed to by 
the 82750DB pointer. 
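
The refresh value rule in the first item above can be checked with a short
model. This is a Python sketch of the arithmetic only, with hypothetical
function names; it treats the register as the 14 low-order bits of the written
value and counts one refresh per '1' bit shifted out until the first '0' bit.

```python
REFRESH_REGISTER_BITS = 14


def refresh_register_value(n_cycles):
    """Return VALUE = 2^N - 1 for N refresh cycles per RFSH code."""
    if not 0 <= n_cycles <= REFRESH_REGISTER_BITS:
        raise ValueError("the 14-bit register can encode 0 to 14 cycles")
    return (1 << n_cycles) - 1


def refresh_cycles(value):
    """Count the refreshes produced by a register value (right-shift model)."""
    value &= (1 << REFRESH_REGISTER_BITS) - 1
    count = 0
    while value & 1:
        count += 1
        value >>= 1
    return count
```

Under this model `refresh_cycles(0x001F)` and `refresh_cycles(0xFFDF)` both
return 5, and an all-ones register returns 14, which matches the INITIAL state
setting of 14 cycles.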

Prior to releasing the microcode processor from its 
HALTed state to run a microcode program, the fol- 
lowing operations must be performed: 

• If 82750PB code is to be executed, bit 15 of the
82750PB CONTROL register must be set to a
one.

• Load a microcode program into microcode RAM
on the 82750PB by writing to the three instruction
word registers (mcode1, the most significant
word of the instruction; mcode2; and
mcode3, the least significant word of the
instruction, the one containing the next address
field) and then writing to maddr, the address in
microcode RAM where the instruction will be
loaded.

• Load the PC with the address in microcode RAM 
of the first instruction to be executed. 

• Write to the 82750PB CONTROL register with the 
HALT bit (bit 0) set to zero, causing the processor 
to start executing an instruction sequence, or with 
the SINGLE-STEP bit (bit 1) set to a one (keeping 
HALT also set to one), causing the processor to 
execute a single instruction. 
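
The start-up steps above can be laid out as an ordered list of host register
writes. The Python sketch below only builds that list; it performs no I/O, and
the function names are hypothetical. The byte offsets come from Tables 3-9,
3-10, and 3-13. Two points are assumptions: the order of the three instruction
word writes relative to each other, and that all interrupt enable bits are left
at zero.

```python
MADDR, MCODE3, MCODE2, MCODE1, PC = 0x006, 0x020, 0x022, 0x024, 0x026
CONTROL = 0x100
HALT_BIT, ENABLE_82750PB_BIT = 1 << 0, 1 << 15


def split_instruction(word48):
    """Split a 48-bit instruction into (mcode1, mcode2, mcode3) words."""
    return (word48 >> 32) & 0xFFFF, (word48 >> 16) & 0xFFFF, word48 & 0xFFFF


def startup_writes(program, start_address, use_82750pb_format=True):
    """Return [(byte_offset, value), ...] to load and start a program.

    program is a list of (microcode RAM address, 48-bit instruction).
    """
    mode = ENABLE_82750PB_BIT if use_82750pb_format else 0
    writes = [(CONTROL, mode | HALT_BIT)]  # select format, stay halted
    for address, word48 in program:
        mcode1, mcode2, mcode3 = split_instruction(word48)
        writes += [(MCODE1, mcode1), (MCODE2, mcode2), (MCODE3, mcode3)]
        writes.append((MADDR, address))  # the maddr write loads the word
    writes.append((PC, start_address))
    writes.append((CONTROL, mode))  # HALT = 0: start executing
    return writes
```

Single-stepping would instead end with a CONTROL write that has both HALT and
SINGLE-STEP set.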


Performance Monitoring 

Two signals, FRZ# and PMON#, which are useful
for microcode performance monitoring, are available
both as external signals, multiplexed on a single output
pin, and as bits in the Processor Status register.
FRZ# is active for each T-cycle when the microcode
processor is frozen, waiting for access to
VRAM or to the VRAM Pointer RAM. PMON# can
be toggled by a special ALU opcode or a special B
bus source code. This allows PMON# to be used to
indicate what particular segment of microcode is being
executed. The PMON/FRZ bit in the Processor
Control register selects the signal that is being output.

Freezes may indicate that the microcode routine is 
not making the most efficient use of the input and 
output FIFO buffering. This is particularly important 
for the inner loops of graphics and video routines 
that are memory-bandwidth limited. Ideally, inner 
loops should be balanced so that the rate pixels are 
processed is equal to the rate that they can be read 
from and written to VRAM with no freezes. The buff- 
ering in the input and output FIFOs serve to make 
sequential reads and writes to VRAM more efficient 
by performing full 64-bit reads and writes, instead of 
individual 8-bit or 16-blt accesses. This has the ef- 
fect of averaging the VRAM read/write rate over a 
number of instruction times. For example, if the 
82750PB is performing a 64-bit read or write every 8 
T-cycles, for an average of 8 bits per T-cycle, a two 
instruction inner loop could read one 8-bit pixel and 
write one 8-bit pixel without any freezes occurring 
(assuming the source pixels and the destination pix- 
els are each sequential). 
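
The balance condition in this example is simple arithmetic, sketched below in
Python. The function name and parameters are hypothetical, and the sketch
assumes one instruction per T-cycle and purely sequential pixel accesses, as
the example in the text implies.

```python
def loop_fits_vram_rate(bits_read, bits_written, instructions,
                        access_bits=64, t_cycles_per_access=8):
    """Return True if an inner loop's VRAM demand fits the average rate.

    Demand is (bits_read + bits_written) per `instructions` T-cycles;
    supply is `access_bits` per `t_cycles_per_access` T-cycles.
    """
    demand_scaled = (bits_read + bits_written) * t_cycles_per_access
    supply_scaled = access_bits * instructions
    return demand_scaled <= supply_scaled
```

With the figures from the text, a two-instruction loop that reads one 8-bit
pixel and writes one 8-bit pixel needs 8 bits per T-cycle, which equals the
average supply, so `loop_fits_vram_rate(8, 8, 2)` is true. The same loop on
16-bit pixels would exceed it.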

The PMON# provides a more standard performance 
monitoring capability by indicating when a particular 
segment of microcode, bracketed by special instruc- 
tions that toggle the PMON# signal, is being exe- 
cuted. This allows either absolute execution-time 
measurement or measurement of the fraction of the 
total execution time that is required by the segment. 
Either the ALU opcode ‘prof’ or the B bus source
code ‘prof’ will toggle the PMON signal.

An external HALT pin is provided on the 82750PB to 
allow external debugging hardware to immediately 
halt the microcode processor. Activating this input 
causes the microcode processor to halt prior to exe- 
cuting the next instruction. When the processor is 
halted, the VRAM interface portion of the 82750PB 
continues to operate normally, performing transfer 
cycles, refresh cycles, and shadow copies as re- 
quested by the 82750DB. 


Host/VRAM Timing Diagrams 

Figures 3-4 through 3-8 are Host/VRAM Timing Dia- 
grams. 
Figure 3-5. VRAM Transfer and Refresh Cycles

Figure 3-6. Host Register Read and Write Cycles

Figure 3-7. Host External Read and Write Cycles

Shaded areas indicate bidirectional signal is driven by host.

NOTES:

1. MREQ#, RFSH#, TRNFR#, and NXTFST# remain inactive during Host External Read and Write cycles.

2. If the Synchronizer on HREQ# is disabled, then the second Ti state will be missing.

Figure 3-8. Host VRAM Read and Write Cycles

Note: 82750PB will stay in TB for the maximum of:

1) one T-state, OR

2) two T-states after HALEN# goes low.

Shaded areas indicate bidirectional signal is driven by host.

NOTES:

1. RFSH#, TRNFR#, and NXTFST# remain inactive during Host VRAM Read and Write cycles.

2. If the Synchronizers on HREQ#/HALEN# are disabled, then the second Ti state will be missing.
4.0 MICROCODE INSTRUCTION FORMAT


Overview 

The 82750PB executes two slightly different instruc- 
tion formats; one that is backward compatible with 
the 82750PA and another that allows full access to 
the microcode resources of the 82750PB. The 
82750PA/82750PB bit in the 82750PB processor 
control register determines which instruction format 
is in effect (see Chapter 3). On reset, the 82750PB is
placed in 82750PA instruction format mode. In this
mode the 82750PB will execute binary microcode
originally assembled for the 82750PA in a manner
that is functionally equivalent to the 82750PA.

The following description applies to the 82750PB
instruction format. Exact definitions of 82750PB
instruction formats and field codings are shown in
Figure 4-2 and Table 4-5.


Instruction Sequencing 

The instruction word for the 82750PB’s microcode processor
is 48 bits wide. The Microcode RAM holds 512
instructions. Nine bits of each instruction specify the
address of the next instruction to be executed. Each 
instruction fetch reads two instructions (of odd ad- 
dress and even address pair) using the upper eight 
bits of the 9-bit instruction address. Both the LSB of 
the instruction address and a Condition Flag bit, se- 
lected from eight possible branching conditions, are 
used to determine whether the next instruction to be 
executed is the even address instruction or odd ad- 
dress instruction, according to the logic table shown 
as Table 4-1. 


Table 4-1. Microcode Next Instruction Selection

LSB of Address    Condition Flag State    Next Instruction

0                 0 (FALSE)               EVEN
0                 1 (TRUE)                EVEN
1                 0 (FALSE)               ODD
1                 1 (TRUE)                EVEN


For an unconditional branch, the condition flag
FALSE (which is always zero) is selected; this causes
the LSB of the address to be passed through to
select the next instruction: LSB = 0 selects EVEN
and LSB = 1 selects ODD. This allows unconditional
branching to any of the 512 instructions in the
RAM. For a conditional branch, the LSB of the address
is set to a one; this causes the state of the
condition flag to select the next instruction: FALSE
selects the ODD instruction and TRUE selects the
EVEN instruction. Therefore, a conditional branch
jumps to either the odd or even instruction of an
odd/even pair depending on the state of the condition.
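
The selection logic of Table 4-1 reduces to a single expression: the ODD
instruction of the fetched pair is chosen only when the LSB of the address is
one and the selected condition flag is FALSE. A minimal Python sketch, with a
hypothetical function name:

```python
def next_instruction_address(naddr, condition_flag):
    """Return the 9-bit address of the next instruction to execute.

    naddr          -- 9-bit next instruction address field
    condition_flag -- state of the selected condition flag (truthy = TRUE)
    """
    pair_base = naddr & 0x1FE  # upper eight bits select the odd/even pair
    select_odd = (naddr & 1) == 1 and not condition_flag
    return pair_base | (1 if select_odd else 0)
```

An assembler targeting this scheme would place the "condition TRUE" successor
at the even address and the "condition FALSE" successor at the odd address of
the same pair.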


Instruction Word Field Descriptions

Each field of the microcode instruction format is described
in the following sections.

NADDR— NEXT INSTRUCTION ADDRESS FIELD 

This field holds the address of the next instruction to be
executed. Taking advantage of the fact that the microcode RAM
is physically organized as 256 deep by 96 wide (two
instructions are fetched per read cycle), a zero-delay two-way
branch can be achieved. The only case in which this field is
not used to determine the address of the next instruction to be
executed is when an instruction writes to the PC. (The term PC
refers to the register that holds the address of the next
instruction to be executed.) When an instruction loads the PC,
a one-instruction delay occurs before the load takes effect.
Therefore, the instruction pointed to by the next instruction
field of the instruction that loads the PC is executed before
the jump to the new address occurs. This is shown in Table 4-2.

There are no restrictions on the instruction following 
a PC load; it will always be executed, even while 
single stepping the processor or if the processor is 
frozen on that instruction. 
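
The one-instruction delay of a PC load can be illustrated with a small simulation that follows the addresses of the PC load example in Table 4-2. The tuple encoding of operations and the `run` helper are hypothetical, chosen only to make the sequencing visible; they do not reflect the real instruction encoding.

```python
def run(program, start, steps):
    """Step a toy microcode program, modelling the delayed PC load.

    program maps address -> (operation, naddr); operation is a tuple:
      ("load_pc", value), ("set_const", reg, value) or ("copy", dst, src).
    """
    regs = {"r0": 0, "r1": 0}
    trace = []
    addr = start
    pending_pc = None                  # PC value written by the previous instruction
    for _ in range(steps):
        op, naddr = program[addr]
        trace.append(addr)
        next_addr = naddr
        if pending_pc is not None:
            # The previous instruction loaded the PC: this instruction still
            # executes, but its own NADDR field is ignored.
            next_addr = pending_pc
            pending_pc = None
        if op[0] == "load_pc":
            pending_pc = op[1]
        elif op[0] == "set_const":
            regs[op[1]] = op[2]
        elif op[0] == "copy":
            regs[op[1]] = regs[op[2]]
        addr = next_addr
    return trace, regs


program = {
    10: (("load_pc", 0), 55),           # load PC with zero, NADDR = 55
    55: (("set_const", "r0", 1), None), # executed; its NADDR is ignored
    0:  (("copy", "r1", "r0"), 25),     # reached after the one-instruction delay
}
trace, regs = run(program, start=10, steps=3)
print(trace, regs)
```

The trace visits addresses 10, 55 and 0 in that order, and both registers end up holding 1, matching the comments in the table.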


CFSEL— CONDITION FLAG SELECT FIELD 

This field selects which condition flag will be used 
with the LSB of NADDR to select the next instruction 
from the odd/even pair. The condition flag assign- 
ment is given in Table 4-3. 




Table 4-2. PC Load Example 


Addr   Instruction   NADDR   Comments
----   -----------   -----   --------
10     pc = 0        55      Load PC with zero.
55     r0 = 1        X       This instruction is executed but its next
                             address field is ignored.
0      r1 = r0       25      PC load takes effect after a one-instruction
                             delay; the result is that r1 = r0 = 1.


Table 4-3. Condition Flag Select Field Assignments 


Value   Flag    Description
-----   ----    -----------
000     FALSE   Select for Unconditional Branch
001     CARRY   Carry Out from ALU Condition Flag Latch
010     OVF     Overflow from ALU Condition Flag Latch
011     SIGN    Sign from ALU Condition Flag Latch
100     ZERO    Zero from ALU Condition Flag Latch
101     LCNTZ   TRUE if Selected Loop Counter = 0
110     LSB     LSB of Data Register r0
111     MSB     MSB of Data Register r0


NOTE: 

The ALU condition flags (CARRY, OVF, SIGN, and ZERO) are latched in the ALU Condition Flag register. This register is 
updated for most — but not all — ALU operations. The remaining flags (LCNTZ, LSB, and MSB) are updated and latched each 
cycle. 


ASRC— A BUS SOURCE SELECT FIELD 

This field selects the element that should drive its 
data onto the A bus during the execution of this in- 
struction. The mapping for this and the following 
three fields is provided in Chapter 6. 

ADST— A BUS DESTINATION SELECT FIELD 

This field selects which element should latch data from the
A bus during the execution of this instruction. See ASRC above.


BSRC— B BUS SOURCE SELECT FIELD 

Same as ASRC, but for B bus. See ASRC above. 


BDST— B BUS DESTINATION SELECT FIELD 

Same as ADST, but for B bus. See ADST above. 


CNT— DECREMENT LOOP COUNTER BIT 

A one in this bit position causes the selected Loop Counter
(selected by LC, the loop counter select bit) to be
decremented. The new value of the loop counter and the updated
LCNTZ condition flag are not ready until the next instruction
cycle. Therefore, in a loop where the loop counter is
decremented and tested for zero in the same instruction
(typically in a one-instruction loop), the start value for the
loop counter should be one less than the number of times the
loop should be executed.
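
The "one less" rule follows from the one-cycle latency of the counter update. The sketch below models a one-instruction loop under the assumption that the branch in each cycle sees the LCNTZ flag computed from the counter value at the start of that cycle; the function name is hypothetical and the model ignores counter width.

```python
def count_loop_executions(start_value):
    """Count how often a one-instruction loop runs when it decrements the
    loop counter and branches on LCNTZ in the same instruction."""
    counter = start_value
    executions = 0
    while True:
        # The flag visible to this instruction reflects the counter value
        # before this instruction's own decrement takes effect.
        lcntz = (counter == 0)
        executions += 1        # the loop instruction executes
        counter -= 1           # CNT = 1: decrement, visible next cycle
        if lcntz:
            break              # LCNTZ TRUE: leave the loop
    return executions


for wanted in (1, 5, 10):
    print(wanted, count_loop_executions(wanted - 1))
```

Loading the counter with N - 1 gives exactly N executions in this model, which is the behaviour the paragraph above describes.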

LIT— LITERAL SELECT BIT 

When this bit is a one, the ASRC and CFSEL fields 
are replaced with a 9-bit literal value that is driven as 
a source in the least significant 9 bits of the A bus. In 
this case, the upper 7 bits of the A bus are forced to 
zeros. The mapping of bits from the literal field to the 
A bus is shown in Figure 4-1. 

NOTE 


A conditional branch and a literal on the A bus are 
not allowed in the same instruction. A 3-bit literal 
can be placed on the B bus in any instruction. 
A bus bits        15 14 13 12 11 10  9    8  7  6  5  4  3    2  1  0
Inst. word bits   (forced to zero)       17 16 15 14 13 12   11 10  9
Field                                    ASRC field          CFSEL field

Figure 4-1. Literal Field Mapping onto A Bus
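
As a minimal sketch of the mapping in Figure 4-1, the following Python function extracts the 9-bit literal from instruction word bits 17 through 9 and returns the 16-bit value that would appear on the A bus. The function name is an assumption for illustration; the position of the LIT bit itself is not modelled.

```python
def literal_on_a_bus(instruction_word):
    """Return the 16-bit A bus value driven when the LIT bit is set.

    Instruction word bits 17..9 (the ASRC and CFSEL fields) become A bus
    bits 8..0; A bus bits 15..9 are forced to zero.
    """
    return (instruction_word >> 9) & 0x1FF


word = (0b101010 << 12) | (0b011 << 9)   # ASRC bits = 101010, CFSEL bits = 011
print(format(literal_on_a_bus(word), "016b"))
```

Because the CFSEL bits are consumed by the literal, the instruction cannot also select a condition flag, which is consistent with the restriction that a conditional branch and an A bus literal may not share an instruction.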


SHFT— SHIFT CONTROL FIELD 

This field controls the bit shifting and byte swapping
logic associated with register r0. The encoding of
this field is given in Table 4-4.


Table 4-4. SHIFT Control Field Coding 


SHFT   Operation
----   ---------
00     No Shift or Swap Operation
01     Shift r0 Right One Bit Position, Sign Extend
10     Shift r0 Left One Bit Position, Zero Fill
11     Byte Swap the Value Being Loaded Into r0*

*Byte swapping only works when r0 is the destination on the
A bus or the B bus. It does not swap data held in r0, only data
being loaded. In order to byte swap data in register r0, r0
must be both a source and destination for either the A or B
bus.

ALUSS— ALU SOURCE SELECT BITS 

These two bits are used as enables for the two ALU input
latches. Bit 39 enables the latch that connects to the A bus;
bit 38 enables the latch connected to the B bus. A one in
either bit position causes the corresponding input latch to
latch the value on the bus to which it is connected (the A or
B bus). A zero on either bit causes the corresponding latch to
hold its current content. This allows the ALU operands either
to come from "eavesdropping" on the A or B bus transfers
occurring in the current instruction cycle or to be held for
multiple instruction cycles in either the A or B input latch.

ALUOP— ALU OPERATION CODE FIELD 

This field specifies the ALU instruction to be performed
during the current instruction cycle. The encoding of this
field is given in Figure 4-2. Normally, at the end of the
instruction execution, the result of the ALU operation is
latched in the ALU output latch, which can be a source on
either the A or B bus. However, if a NOP is selected for the
ALU operation, the ALU output latch is not latched; the data is
held from the previous instruction. In addition to NOP, certain
other ALU opcodes do not actually perform ALU operations and
therefore do not latch the ALU results. They are INT (microcode
interrupt) and the PROF instruction.


LC— LOOP COUNTER SELECT BIT 

This bit selects which of the two loop counters is to
be used for decrementing or Loop-Counter-Zero
conditional branching in the current instruction. A
zero selects loop counter zero and a one selects
loop counter one.

Refer to the Intel 82750PB Microcode Programming 
Guide for more information on microcode programming. 
Table 4-5. 82750PB Source/Destination Coding

A dash marks a code with no entry in that column.

Address
(Hex)     BDST        BSRC        ADST        ASRC
-------   ----        ----        ----        ----
0x0       Null        Null        Null        Null
0x1       -           alu         -           hwid
0x2       *dram3      *dram3      -           cc
0x3       *dram4      *dram4      maddr       -
0x4       *dram3++    *dram3++    -           alu
0x5       *dram4++    *dram4++    cnt         cnt
0x6       *dram3--    *dram3--    cnt2        cnt2
0x7       *dram4--    *dram4--    lcnt        lcnt
0x8       r0          r0          r0          r0
0x9       r1          r1          r1          r1
0xA       r2          r2          r2          r2
0xB       r3          r3          r3          r3
0xC       r4          r4          r4          r4
0xD       r5          r5          r5          r5
0xE       r6          r6          r6          r6
0xF       r7          r7          r7          r7
0x10      r8          *in1        mcode3      mcode3
0x11      r9          *in2        mcode2      mcode2
0x12      r10         *stat       mcode1      mcode1
0x13      r11         *stat#      pc          pc
0x14      r12         circbuf     pixint-c    -
0x15      r13         -           pixint      pixint
0x16      r14         -           *dram1      *dram1
0x17      r15         -           *dram2      *dram2
0x18      circbuf     literal 0   *dram1++    *dram1++
0x19      -           literal 1   *dram2++    *dram2++
0x1A      *dram1      literal 2   *dram1--    *dram1--
0x1B      *dram2      literal 3   *dram2--    *dram2--
0x1C      *dram1++    literal 4   dram1       dram1
0x1D      *dram2++    literal 5   dram2       -
0x1E      *dram1--    literal 6   dram3       -
0x1F      *dram2--    literal 7   dram4       -
0x20      *out1       prof        *out1       *in1
0x21      out1++      -           out1++      *in2
0x22      out1-lo     out1-lo     shift-rl    *stat
0x23      out1-hi     out1-hi     out1-hi     *stat#
0x24      *out2       stat-lo     *out2       -
0x25      out2++      stat-hi     out2++      -
0x26      -           out2-lo     shift-r     -
0x27      -           -           out2-hi     -
0x28      -           out1-c      -           -
0x29      -           in1-c       in1-c       -
0x2A      -           in1-lo      shift-l     -
0x2B      -           in1-hi      in1-hi      -
0x2C      -           out2-c      out2-c      -
0x2D      in2-c       in2-c       in2-c       -
0x2E      in2-lo      in2-lo      -           -
0x2F      in2-hi      in2-hi      in2-hi      -
0x30      stat-ram    r8          r8          r8
0x31      stat-c      r9          r9          r9
0x32      stat-lo     r10         r10         r10
0x33      stat-hi     r11         r11         r11
0x34      yeven-lo    r12         r12         r12
0x35      yeven-hi    r13         r13         r13
0x36      yodd-lo     r14         r14         r14
0x37      yodd-hi     r15         r15         r15
0x38      ypitch      shift       cc          shift
0x39      -           stat-c      fcnt        fcnt
0x3A      vu-lo       *dram1      *dram3      -
0x3B      vu-hi       *dram2      *dram4      -
0x3C      vupitch     *dram1++    *dram3++    *dram3++
0x3D      vpitch      *dram2++    *dram4++    *dram4++
0x3E      vptr-lo     *dram1--    *dram3--    *dram3--
0x3F      vptr-hi     *dram2--    *dram4--    *dram4--
Figure 4-2. 82750PB Instruction Word Format (Continued)

5.0 ELECTRICAL DATA

Maximum Ratings

Table 5-1 is a stress rating only, and functional operation 
at the maximums is not guaranteed. Functional operat- 
ing conditions are given in the DC and AC Characteris- 
tics (Tables 5-2, 5-3, 5-4, and 5-5). 

DC Characteristics 


Table 5-1. Absolute Maximum Requirements 


Condition                                     Maximum Requirement
---------                                     -------------------
Case Temperature under Bias                   -65°C to 110°C
Storage Temperature                           -65°C to 150°C
Voltage on Any Pin with Respect to Ground     -0.5V to VCC + 0.5V
Supply Voltage with Respect to VSS            -0.5V to +6.5V


Table 5-2. DC Characteristics VCC = 5V ±10%, TCASE = 0°C to 90°C

Symbol   Parameter                  Min     Typ     Max     Unit   Notes
------   ---------                  ---     ---     ---     ----   -----
VIL      Input LOW Voltage          -0.3            0.6     V      (Note 1)
VIH      Input HIGH Voltage         2.0             VCC     V      (Note 1)
VOL      Output LOW Voltage                 0.2     0.4     V      IOL = 4.0 mA
VOH      Output HIGH Voltage        2.4     3.0             V      IOH = -1.0 mA
IIL      Input Leakage Current      -10             +10     µA
IOZ      Output Leakage Current     -10             +10     µA
ICC      Power Supply Current               150     200     mA     25 MHz (Note 2)
CIN      Input Capacitance                          10.0    pF     Fc = 1 MHz (Note 3)
COUT     Output Capacitance                         12.0    pF
CCLKIN   CLKIN Input Capacitance                    20.0    pF     Fc = 1 MHz (Note 3)

NOTES:

1. Measured with CLKIN = 8 MHz.

2. Typical current value measured under typical conditions. Maximum current value guaranteed with 50 pF maximum output
loading.

3. Not 100% tested.


Exposure to Maximum Ratings may affect device re- 
liability. Furthermore, although the 82750PB con- 
tains protective circuitry to resist damage from static 
electrical discharge, always take precautions to 
avoid high static voltages or electric fields. 
AC Characteristics

Table 5-3. AC Characteristics at 25 MHz VCC = 5V ±10%, TCASE = 0°C to +90°C, CL = 50 pF

Symbol  Parameter                             Min         Max         Unit  Figure  Notes
------  ---------                             ---         ---         ----  ------  -----
        Frequency                             8           25          MHz           1x Clock
t1      CLKIN Period                          40          125         ns    5-1
t2      CLKIN High Time                                   26          ns    5-1     (Note 1)
t3      CLKIN Low Time                        14          26          ns    5-1     (Note 1)
t4      CLKIN Fall Time                                   4           ns    5-1
t5      CLKIN Rise Time                                   4           ns    5-1
t6a     A[31:2], BE#[3:0], WE#, D[31:0],      3           25          ns    5-2
        HINT#, PMFRZ# Valid Delay
t6b     MREQ#, TRNFR#, RFSH#, NXTFST#,        3           18          ns    5-2
        HBUSEN#, HRDY# Valid Delay
t7      A[31:2], BE#[3:0], WE#, D[31:0]                   30          ns    5-2     (Note 2)
        Float Delay
t8      MRDY# Setup                           10                      ns    5-3
t9      MRDY# Hold                            6                       ns    5-3
t10     HREQ#, VBUS[3:0], RESET#, HALEN#,     8                       ns    5-3
        HALT# Setup
t11     HREQ#, VBUS[3:0], RESET#, HALEN#,     6                       ns    5-3
        HALT# Hold
t12     A[8:2], BE#[3:0], WE#, D[31:0]        4                       ns    5-3     (Note 3)
        Setup
t13     A[8:2], BE#[3:0], WE#, D[31:0]        6                       ns    5-3     (Note 3)
        Hold
t14     HREG#, HRAM# Setup                    10                      ns    5-3
t15     HREG#, HRAM# Hold                     6                       ns    5-3
t16     CLKOUT Valid Delay                                18          ns    5-4
t17     CLKOUT High Time                      1/2 t1 - 6  1/2 t1 + 6  ns    5-4

NOTES:

1. This assumes a 40 ns period. For other speeds these values should fall between 40% and 60% duty cycle.

2. Not 100% tested. Guaranteed by design characterization.

3. Inputs must remain valid throughout all cycles of host accesses. See Figures 3-6 through 3-8.

4. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load.





Output Delay and Rise Time Versus Load Capacitance 



Figure 5-5. Typical Output Valid Delay Versus Load Capacitance under Worst Case Conditions 



Figure 5-6. Typical Output Rise Time Versus Load Capacitance under Worst Case Conditions 
6.0 MECHANICAL DATA

Packaging Outlines and Dimensions

Intel packages the 82750PB in a Plastic Quad Flat Pack (PQFP). Table 6-1 gives the symbol list for the PQFP. 


Table 6-1. PQFP Symbol List 


Letter or
Symbol      Description of Dimensions
---------   -------------------------
A           Package Height: Distance from Seating Plane to Highest Point of Body
A1          Standoff: Distance from Seating Plane to Base Plane
D/E         Overall Package Dimension: Lead Tip to Lead Tip
D1/E1       Plastic Body Dimension
D2/E2       Bumper Distance
D3/E3       Footprint
L1          Foot Length
N           Total Number of Leads


The PQFP has the following specifications: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982.

2. Datum plane -H- is located at the mold parting line and is coincident with the bottom of the lead where the lead
exits the plastic body.

3. Datums A-B and -D- are to be determined where center leads exit the plastic body at datum plane -H-.

4. Controlling dimension is the inch.

5. Dimensions D1, D2, E1, and E2 are measured at the mold parting line and do not include mold protrusion.
Allowable mold protrusion is 0.18 mm (0.007 in.) per side.

6. Pin 1 identifier is located within one of the two zones indicated.

7. Measured at datum plane -H-.

8. Measured at seating plane datum -C-.
Table 6-2 provides outline characteristics for 0.025 in. pitch.

Table 6-2. Intel Case Outline Drawings for PQFP at 0.025 inch Pitch 
Symbol Description Min Max 



Figure 6-1. Principal Dimensions of the 82750PB in the 132-Lead PQFP Package

Figure 6-3. Detailed Dimensions of the 82750PB in the 132-Lead PQFP: Terminal Details

Figure 6-4. 132-Lead PQFP Mechanical Package Detail: Protective Bumper


Figure 6-5. 132-Lead PQFP Mechanical Package Detail: Typical Lead
Package Thermal Specifications

The 82750PB is specified for operation when TC (the case
temperature) is within the range of 0°C to 90°C. TC may be
measured in any environment to determine whether the 82750PB
is within the specified operating range. The case temperature
should be measured at the center of the top surface.

TA (the ambient temperature) can be calculated from θCA
(thermal resistance from case to ambient) with the following
equation:

TA = TC - P × θCA

Typical values for θCA at various airflows are given in
Table 6-3 for the 132-lead PQFP package. Table 6-4 shows the
maximum TA allowable (without exceeding TC) at various
airflows. The power dissipation (P) is calculated by using the
typical supply current at 5V as shown in Table 5-2.
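
As a worked check of the equation, the sketch below evaluates the maximum ambient temperature for each airflow using the θCA values of Table 6-3, a case temperature limit of 90°C, and P = 5V × 150 mA (the typical supply current in Table 5-2). The constant and function names are chosen for this example only, and the rounding used to produce the published table is not stated in the source.

```python
T_CASE_MAX_C = 90.0          # upper limit of the specified case temperature
SUPPLY_VOLTS = 5.0
TYPICAL_SUPPLY_AMPS = 0.150  # typical ICC from Table 5-2

# Thermal resistance case-to-ambient (deg C per watt), keyed by airflow in ft/min.
THETA_CA = {0: 26.0, 200: 17.5, 400: 14.0, 600: 11.5, 800: 9.5, 1000: 8.5}


def max_ambient_c(theta_ca, t_case=T_CASE_MAX_C,
                  power_w=SUPPLY_VOLTS * TYPICAL_SUPPLY_AMPS):
    """TA = TC - P * theta_CA."""
    return t_case - power_w * theta_ca


for airflow, theta in THETA_CA.items():
    print(f"{airflow:5d} ft/min: TA max = {max_ambient_c(theta):.1f} C")
```

The computed values land within about 1°C of the entries in Table 6-4, which suggests the table was derived the same way and then rounded.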


Table 6-3. Thermal Resistance (°C/W)

θCA Versus Airflow, ft/min (m/sec)

Package          0       200      400      600      800      1000
                 (0)     (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
-------          ----    ------   ------   ------   ------   ------
132-Lead PQFP    26.0    17.5     14.0     11.5     9.5      8.5


Table 6-4. Maximum TA at Various Airflows (°C)

TA Versus Airflow, ft/min (m/sec)

Package          Frequency   0       200      400      600      800      1000
                 (MHz)       (0)     (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
-------          ---------   ----    ------   ------   ------   ------   ------
132-Lead PQFP    25          70      76       80       81       83       84
i860™ Microprocessor Family


i860™ XP MICROPROCESSOR


■ Parallel Architecture that Supports Up to Three Operations per Clock
  - One Integer or Control Instruction
  - Up to Two Floating-Point Results

■ High Performance Design
  - 40/50 MHz Clock Rate
  - 100 Peak Single Precision MFLOPS
  - 75 Peak Double Precision MFLOPS
  - 64-Bit External Data Bus
  - 64-Bit Internal Code Bus
  - 128-Bit Internal Data Bus

■ High Integration on One Chip
  - 32-Bit Integer and Control Unit
  - 32/64-Bit Pipelined Floating-Point
  - 64-Bit 3-D Graphics Unit
  - Paging Unit with 64 Four-Kbyte and 16 Four-Mbyte Pages
  - 16 Kbyte Code Cache
  - 16 Kbyte Data Cache

■ Fast, Multiprocessor-Oriented Bus
  - Burst Cycles Move 400 Mbyte/Sec
  - Hardware Cache Snooping
  - MESI Cache Consistency Protocol
  - Supports Second-Level Cache
  - Supports DRAM

■ Compatible with Industry Standards
  - ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic
  - Intel386™/Intel486™/i860™ Data Formats and Page Table Entries
  - Binary Compatible with i860™ XR Applications Instruction Set
  - Detached Concurrency Control Unit (CCU) Supports Parallel
    Architecture Extensions (PAX)
  - JEDEC 262-pin Ceramic Pin Grid Array Package
  - IEEE Standard 1149.1/D6 Boundary-Scan Architecture

■ Easy to Use
  - On-Chip Debug Register
  - UNIX*/860
  - APX Attached Processor Executive
  - Assembler, Linker, Simulator, Debugger, C and FORTRAN Compilers,
    FORTRAN Vectorizer, Scalar and Vector Math Libraries
  - Graphics Libraries


The Intel i860 XP Microprocessor (order code A80860XP) delivers supercomputing performance in a single 
VLSI component. The 32/64-bit architecture of the i860 XP microprocessor balances integer, floating point, 
and graphics performance for applications such as engineering workstations, scientific computing, 3-D graph- 
ics workstations, and multiuser systems. Its parallel architecture achieves high throughput with RISC design 
techniques, multiprocessor support, pipelined processing units, wide data paths, large on-chip caches, 2.5 
million transistor design, and fast 0.8-micron silicon technology. 


Figure 0.1. Block Diagram (external signal groups: A31-A3, D63-D0, CONTROL)

*UNIX is a registered trademark of UNIX System Laboratories, Inc.

Intel, i860, Intel386 and Intel486 are trademarks of Intel Corporation.
1.0 FUNCTIONAL DESCRIPTION

As shown by the block diagram on the front page, 
the i860 XP Microprocessor consists of the following 
units: 

1. Integer Registers and Core Execution Unit 

2. Floating-Point Registers and Control Unit 

3. Floating-Point Adder Unit 

4. Floating-Point Multiplier Unit 

5. Graphics Unit

6. Paging Unit 

7. Instruction Cache 

8. Data Cache 

9. Bus and Cache Control Unit 

10. Detached Concurrency Control Unit 

The core execution unit controls overall operation of the
i860 XP microprocessor. It executes load, store, integer, bit,
I/O, and control-transfer operations, and fetches instructions
for the floating-point unit as well. A set of 32 x 32-bit
general-purpose registers is provided for the manipulation of
integer data. Load and store instructions move 8-, 16-, and
32-bit data to and from these registers. Its full set of
integer, logical, and control-transfer instructions gives the
core unit the ability to execute complete systems software and
applications programs. A trap mechanism provides rapid response
to exceptions and external interrupts. Debugging is supported
by the ability to trap on data or instruction reference.

The floating-point hardware is connected to a separate set
of floating-point registers, which can be accessed as 16 x
64-bit registers or as 32 x 32-bit registers. Load and store
instructions can also access these same registers as 8 x
128-bit registers. All floating-point and graphics instructions
use these registers as their source and destination operands.


The floating-point multiplier performs floating-point and
integer multiply as well as floating-point reciprocal
operations on 64- and 32-bit floating-point values. A
multiplier instruction executes in three to four clocks;
however, in pipelined mode, a new result can be generated every
clock for single precision and every other clock for double
precision.


The graphics unit supports three-dimensional drawing in a
graphics frame buffer, with color intensity shading and hidden
surface elimination via the Z-buffer algorithm. The graphics
unit recognizes the pixel as an 8-, 16-, or 32-bit integer data
type. It can compute individual red, blue, and green color
intensity values within a pixel, but it does so with parallel
operations that take advantage of the 64-bit internal word size
and 64-bit external bus. The graphics features of the i860 XP
microprocessor assume that the surface of a solid object is
drawn with polygon patches which, like the pieces of a puzzle,
collectively approximate the shape of the original object. The
color intensities of the vertices of the polygon and their
distances from the viewer are known, but the distances and
intensities of the other points must be calculated by
interpolation. The graphics instructions of the i860 XP
microprocessor directly aid such interpolation.



The paging unit implements protected, paged, virtual memory.
The paging unit uses two four-way set-associative cache
memories called TLBs (Translation Lookaside Buffers) to perform
the translation of logical address to physical address, and to
check for access violations. The access protection scheme
employs two levels of privilege: user and supervisor. One TLB
supports 4 Kbyte pages and has 64 entries; the other supports
4 Mbyte pages and has 16 entries.
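
To make the two page sizes concrete, the sketch below splits a 32-bit address into a page number and an offset for each size, and derives the number of sets in each four-way TLB from its entry count. The helper names are hypothetical, and how the hardware indexes the TLB sets is not described here, so it is not modelled.

```python
PAGE_4K = 4 * 1024
PAGE_4M = 4 * 1024 * 1024


def split_address(addr, page_size):
    """Return (page_number, offset) of a 32-bit address for a page size
    that is a power of two."""
    offset_bits = page_size.bit_length() - 1   # 12 for 4 Kbyte, 22 for 4 Mbyte
    return addr >> offset_bits, addr & (page_size - 1)


def tlb_sets(entries, ways=4):
    """Number of sets in a set-associative TLB with the given entry count."""
    return entries // ways


addr = 0x12345678
print([hex(v) for v in split_address(addr, PAGE_4K)])
print([hex(v) for v in split_address(addr, PAGE_4M)])
print(tlb_sets(64), tlb_sets(16))
```

A 4 Kbyte page leaves a 20-bit page number to translate, while a 4 Mbyte page leaves only 10 bits, so each large-page entry covers 1024 times as much address space.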


The instruction cache is a four-way set-associative
memory of 16 Kbytes, with 32-byte lines. It transfers
up to 64 bits per clock (400 Mbyte/sec at 50 MHz).
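
The quoted transfer rate follows directly from the bus width and clock rate, and the cache geometry fixes the number of sets. A brief sketch, with function names invented for this example and "Mbyte" taken to mean 10^6 bytes:

```python
def peak_mbytes_per_sec(bits_per_clock, clock_mhz):
    """Peak transfer rate in Mbyte/sec (10**6 bytes) for a given width and clock."""
    return bits_per_clock / 8 * clock_mhz


def cache_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)


print(peak_mbytes_per_sec(64, 50))    # 64 bits per clock at 50 MHz
print(cache_sets(16 * 1024, 4, 32))   # 16 Kbytes, four-way, 32-byte lines
```

The same formula applied to a 128-bit path at 50 MHz gives twice the rate of the 64-bit path.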


The floating-point control unit controls both the
floating-point adder and the floating-point multiplier, issuing
instructions, handling all source and result exceptions, and
updating status bits in the floating-point status register. The
adder and multiplier can operate in parallel, producing up to
two results per clock. The floating-point data types,
floating-point instructions, and exception handling all support
the IEEE Standard for Binary Floating-Point Arithmetic
(ANSI/IEEE Std 754-1985).

The floating-point adder performs addition, subtraction,
comparison, and conversions on 64- and 32-bit floating-point
values. An adder instruction executes in three clocks; however,
in pipelined mode, a new result is generated every clock.


The data cache is a four-way set-associative memory of
16 Kbytes, with 32-byte lines. It transfers up to 128 bits per
clock (800 Mbyte/sec at 50 MHz). The i860 XP microprocessor
normally uses write-back caching, i.e., memory writes update
the cache (if applicable) without necessarily updating memory
immediately; however, under both software and hardware control,
write-through and write-once policies can be implemented, or
caching can be inhibited. The caches are transparent to
applications software.

The bus and cache control unit performs data and instruction
accesses for the core unit. It receives cycle requests and
specifications from the core unit, performs the data-cache or
instruction-cache miss processing, controls TLB translation,
and provides the interface to the external bus. Its pipelined
structure supports up to three outstanding bus cycles. Its
burst mode transfers data at up to 400 Mbyte/sec at 50 MHz. In
multiprocessor systems, it maintains cache consistency by
monitoring bus activity in parallel with other CPU functions.

The DCCU (detached concurrency control unit) is a 
compatible subset of the external CCU that expe- 
dites loop-level parallelism and synchronization in 
multiprocessor systems. The DCCU consists of reg- 
isters and a counter that allow a single i860 XP mi- 
croprocessor to run binary code compiled for a mul- 
tiprocessor system adhering to the PAX parallel ap- 
plications binary interface (ABI). 

The i860 XP microprocessor may be used with or without an
external, secondary cache built from 82495XP and 82490XP cache
components. An 82495XP and 82490XP cache provides up to 512
Kbytes of high-speed storage for data and instructions
combined. In most cases, an 82495XP and 82490XP cache can
provide data to the CPU with zero wait states. The larger size
of an external cache can provide an increased hit rate when the
size or number of data structures and programs exceeds the size
of the internal caches. In multiprocessor systems, the external
cache serves as local memory, and can reduce bus traffic. An
external cache also hides the processor from the rest of the
system, which is a double advantage:

1. The processor can be upgraded without affecting the
design of the memory and other subsystems.

2. Slower and less expensive memory and I/O subsystem
designs can be employed without unduly lowering overall
system performance.

Refer to the 82495XP Cache Controller/82490XP
Cache RAM Data Sheet (Intel Order #240956) for
more information.


2.0 PROGRAMMING INTERFACE 

The programmer-visible aspects of the architecture 
of the i860 XP microprocessor include data types, 
registers, instructions, and traps. 


2.1 Data Types 

The i860 XP microprocessor provides operations for integer
and floating-point data. Integer operations are performed on
32-bit operands, with some support also for 64-bit operands.
Load and store instructions can reference 8-bit, 16-bit,
32-bit, 64-bit, and 128-bit operands. Floating-point operations
are performed on IEEE-standard 32- and 64-bit formats. Graphics
instructions operate on arrays of 8-, 16-, or 32-bit pixels.


2.1.1 INTEGER 

An integer is a 32-bit signed value in standard two's
complement form. A 32-bit integer can represent a value in the
range -2,147,483,648 (-2^31) to 2,147,483,647 (+2^31 - 1).
Arithmetic operations on 8- and 16-bit integers can be
performed by sign-extending the 8- or 16-bit values to 32 bits,
then using the 32-bit operations.

There are also add and subtract instructions that op- 
erate on 64-bit long integers. 

Load and store instructions may also reference (in addition
to the 32- and 64-bit formats previously mentioned) 8- and
16-bit items in memory. When an 8- or 16-bit item is loaded
into a register, it is converted to an integer by
sign-extending the value to 32 bits. When an 8- or 16-bit item
is stored from a register, the corresponding number of
low-order bits of the register are used.
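
The load and store conversions can be illustrated as follows. This is a sketch of the arithmetic only; the function names are invented for the example and do not correspond to i860 instructions.

```python
def load_sign_extended(item, width):
    """Sign-extend an 8- or 16-bit memory item to a signed integer value."""
    item &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    return (item ^ sign_bit) - sign_bit


def register_bits(value):
    """Two's complement bit pattern of a signed value in a 32-bit register."""
    return value & 0xFFFFFFFF


def store_truncated(reg_bits, width):
    """Keep only the low-order `width` bits of a register for the store."""
    return reg_bits & ((1 << width) - 1)


value = load_sign_extended(0xFE, 8)          # byte 0xFE loads as -2
print(value, hex(register_bits(value)))      # register holds 0xfffffffe
print(hex(store_truncated(register_bits(value), 8)))
```

Loading a byte and storing it back returns the original byte, because sign extension changes only the bits above the item's width.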

2.1.2 ORDINAL 

Arithmetic operations are available for 32-bit ordinals.
An ordinal is an unsigned integer. An ordinal can represent
values in the range 0 to 4,294,967,295 (+2^32 - 1).

Also, there are add and subtract instructions that op- 
erate on 64-bit ordinals. 


2.1.3 SINGLE- AND DOUBLE-PRECISION REAL 

Figure 2.1 shows the real number formats. A single-precision
real (also called "single real") data type is a 32-bit binary
floating-point number. Bit 31 is the sign bit (s); bits 30..23
are the exponent (e); and bits 22..0 are the fraction (f). In
accordance with ANSI/IEEE standard 754, the value of a
single-precision real is defined as follows:

1. If e = 0 and f ≠ 0, or e = 255, then generate a
floating-point source-exception trap when encountered in a
floating-point operation.

2. If 0 < e < 255, then the value is (-1)^s × 1.f × 2^(e-127).

3. If e = 0 and f = 0, then the value is signed zero.

A double-precision real (also called "double real") data
type is a 64-bit binary floating-point number. Bit 63 is the
sign bit (s); bits 62..52 are the exponent (e); and bits 51..0
are the fraction (f). In accordance with ANSI/IEEE standard
754, the value of a double-precision real is defined as
follows:

1. If e = 0 and f ≠ 0, or e = 2047, then generate a
floating-point source-exception trap when encountered in a
floating-point operation.

2. If 0 < e < 2047, then the value is (-1)^s × 1.f × 2^(e-1023).

3. If e = 0 and f = 0, then the value is signed zero.

The special values infinity, NaN ("Not a Number"),
indefinite, and denormal generate a trap when encountered.
The trap handler implements IEEE-standard results.
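
The three cases can be expressed as a small decoder. In this sketch, `decode_real` is a hypothetical helper that returns `None` for bit patterns that would raise a source-exception trap in hardware; it describes the value rules above, not the behaviour of the trap handler.

```python
def decode_real(bits, exp_bits, frac_bits):
    """Decode a real number from its bit pattern.

    Use exp_bits=8, frac_bits=23 for single precision and
    exp_bits=11, frac_bits=52 for double precision.
    Returns None where the hardware would trap (denormal, infinity, NaN).
    """
    s = (bits >> (exp_bits + frac_bits)) & 1
    e = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    f = bits & ((1 << frac_bits) - 1)
    e_max = (1 << exp_bits) - 1          # 255 or 2047
    bias = e_max >> 1                    # 127 or 1023
    if e == e_max or (e == 0 and f != 0):
        return None                      # case 1: source-exception trap
    if e == 0:
        return -0.0 if s else 0.0        # case 3: signed zero
    significand = 1.0 + f / (1 << frac_bits)              # the "1.f" term
    return (-1.0) ** s * significand * 2.0 ** (e - bias)  # case 2


print(decode_real(0x3F800000, 8, 23))            # single-precision pattern
print(decode_real(0xC0200000, 8, 23))
print(decode_real(0x3FF0000000000000, 11, 52))   # double-precision pattern
print(decode_real(0x7F800000, 8, 23))            # infinity: would trap
```

Passing the field widths as parameters keeps one function for both formats; the bias falls out of the exponent width, matching the 127 and 1023 in the formulas.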


A double real value occupies an even/odd pair of
floating-point registers. Bits 31..0 are stored in the
even-numbered floating-point register; bits 63..32 are stored
in the next higher odd-numbered floating-point register.


2.1.4 PIXEL

A pixel may be 8, 16, or 32 bits long, depending on color and
intensity resolution requirements. Regardless of the pixel
size, the i860 XP microprocessor always operates on 64 bits of
pixel data at a time. The pixel data type is used by two kinds
of instructions:

• The selective pixel-store instruction that helps implement
hidden surface elimination.

• The pixel add instruction that helps implement 3-D color
intensity shading.

To perform color intensity shading efficiently in a variety of
applications, the i860 XP microprocessor defines three pixel
formats according to Table 2.1.

Figure 2.2 illustrates one way of assigning meaning to the
fields of pixels. These assignments are for illustration
purposes only. The i860 XP microprocessor defines only the
field sizes, not the specific use of each field. Other ways of
using the fields of pixels are possible.



Single-Precision Real: bit 31 = s (sign), bits 30..23 = e (exponent), bits 22..0 = f (fraction)

Double-Precision Real: bit 63 = s (sign), bits 62..52 = e (exponent), bits 51..0 = f (fraction)

Figure 2.1. Real Number Formats


Table 2.1. Pixel Formats 


Pixel Size   Bits of Color 1   Bits of Color 2   Bits of Color 3   Bits of Other Attribute
(in bits)    Intensity(1)      Intensity(1)      Intensity(1)      (Texture, Color)
----------   ---------------   ---------------   ---------------   -----------------------
8            N (≤ 8) bits of intensity(2)                          8 - N
16           6                 6                 4                 0
32           8                 8                 8                 8

NOTES:

1. The intensity attribute fields may be assigned to colors in any order convenient to the application.

2. With 8-bit pixels, up to 8 bits can be used for intensity; the remaining bits can be used for any other attribute, such as
color or texture. Bits that require interpolation (shading), such as those for intensity, must be the low-order bits of the pixel.
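
As an illustration of the field sizes, the sketch below packs and unpacks a 16-bit pixel with 6-, 6-, and 4-bit intensity fields and counts how many pixels fit in one 64-bit operation. The ordering of the fields within the pixel (first field in the low-order bits) is an assumption of this example; the architecture defines only the field sizes.

```python
PIXEL_16_FIELD_WIDTHS = (6, 6, 4)   # color 1, color 2, color 3 intensity bits


def pixels_per_operation(pixel_bits):
    """Pixels handled together when 64 bits of pixel data are processed."""
    return 64 // pixel_bits


def pack_fields(values, widths):
    """Pack field values into one pixel, first field in the low-order bits."""
    pixel, shift = 0, 0
    for value, width in zip(values, widths):
        pixel |= (value & ((1 << width) - 1)) << shift
        shift += width
    return pixel


def unpack_fields(pixel, widths):
    """Inverse of pack_fields."""
    values, shift = [], 0
    for width in widths:
        values.append((pixel >> shift) & ((1 << width) - 1))
        shift += width
    return tuple(values)


pixel = pack_fields((1, 2, 3), PIXEL_16_FIELD_WIDTHS)
print(hex(pixel), unpack_fields(pixel, PIXEL_16_FIELD_WIDTHS))
print([pixels_per_operation(n) for n in (8, 16, 32)])
```

A different field order would change only the shift amounts, not the widths, which is why the data sheet can leave the assignment to the application.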

NOTE:

These assignments of specific meanings to the fields of pixels are for illustration only. Only the field sizes are defined,
not the specific use of each field.


Figure 2.2. Pixel Format Example 


2.2 Register Set 

As Figure 2.3 shows, the i860 XP microprocessor 
has the following registers: 

• An integer register file 

• A floating-point register file 

• Control registers psr, epsr, db, dirbase, fir, fsr,
bear, ccr, p3, p2, p1, p0

• Special-purpose registers KR, KI, T, MERGE,
STAT, and NEWCURR

The control registers are accessible only by load
and store control-register instructions; the integer
and floating-point registers are accessed by arithmetic
operations and load and store instructions. The
special-purpose registers KR, KI, and T are used by
floating-point instructions; MERGE is used by graphics
instructions. NEWCURR and STAT are used for
concurrency control; they are accessed by memory
load and store instructions.


2.2.1 INTEGER REGISTER FILE 

There are 32 integer registers, each 32 bits wide,
referred to as r0 through r31, which are used for
address computation and scalar integer computations.
Register r0 always returns zero when read.

2.2.2 FLOATING-POINT REGISTER FILE 

There are 32 floating-point registers, each 32 bits
wide, referred to as f0 through f31, which are used
for floating-point computations. Registers f0 and f1
always return zero when read. The floating-point
registers are also used by a set of integer operations,
primarily for graphics computations.


When accessing 64-bit floating-point or integer values,
the i860 XP microprocessor uses an even/odd
pair of registers. When accessing 128-bit values, it
uses an aligned set of four registers (f0, f4, f8, f12,
f16, f20, f24, or f28). The instruction must designate
the lowest register number of the set of registers
containing 64- or 128-bit values. Misaligned register
numbers produce undefined results. The register
with the lowest number contains the least significant
part of the value. For 128-bit values, the register pair
with the lower number contains the value from the
lower memory address; the register pair with the
higher number contains the value from the higher
address.
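
The register-grouping rule above can be sketched in Python. This is an illustrative model only, not part of the i860 XP definition: the function name fp_operand_registers is hypothetical, and because the hardware result for a misaligned register number is undefined, the sketch simply rejects such numbers.

```python
def fp_operand_registers(reg: int, size_bits: int) -> list[int]:
    """Return the f-register numbers that hold one operand.

    The list starts with the designated (lowest) register, which
    holds the least significant 32 bits of the value.
    """
    if size_bits not in (32, 64, 128):
        raise ValueError("operand size must be 32, 64, or 128 bits")
    if not 0 <= reg <= 31:
        raise ValueError("floating-point registers are f0 through f31")
    count = size_bits // 32          # 1, 2, or 4 registers
    if reg % count != 0:
        # Hardware behavior is undefined here; the sketch refuses instead.
        raise ValueError(f"f{reg} is misaligned for a {size_bits}-bit operand")
    return list(range(reg, reg + count))
```

For example, a 128-bit operand designated by f4 occupies f4 through f7, while designating f6 for a 128-bit operand is rejected.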

The 128-bit load and store instructions, along with
the 128-bit data path between the floating-point
registers and the data cache, help to sustain an
extraordinarily high rate of computation.

2.2.3 PROCESSOR STATUS REGISTER 

The processor status register (psr) contains
miscellaneous state information for the current process.
Figure 2.4 shows the format of the psr.

• BR (Break Read) and BW (Break Write) enable a 
data access trap when the operand address 
matches the address in the db register and a 
read or write (respectively) occurs. 

• Various instructions set CC (Condition Code)
according to tests they perform. The
branch-on-condition-code instructions test its value.
The bla instruction sets and tests LCC (Loop
Condition Code).

• IM (Interrupt Mode), if set, enables external
interrupts on the INT pin; if clear, it disables
interrupts on INT. IM does not affect parity error
interrupts or interrupts on the BERR pin.
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Figure 2.3. Registers and Data Paths 



• U (User Mode) is set when the i860 XP micro- 
processor is executing in user mode; it is clear 
when the i860 XP microprocessor is executing in 
supervisor mode. In user mode, writes to some 
control registers are inhibited. This bit also con- 
trols the memory protection mechanism. 

• PIM (Previous Interrupt Mode) and PU (Previous
User Mode) save the corresponding status bits
(IM and U) on a trap, because those status bits
are changed when a trap occurs. They are restored
into their corresponding status bits when
returning from a trap handler with a branch
indirect instruction when a trap flag is set in the psr.


• FT (Floating-Point Trap), DAT (Data Access
Trap), IAT (Instruction Access Trap), IN (Interrupt),
and IT (Instruction Trap) are trap flags.
They are set when the corresponding trap condition
occurs. IN is set on INT, bus error, and parity
error. The trap handler examines these bits (and
other trap bits in the epsr) to determine which
condition or conditions have caused the trap.

• DS (Delayed Switch) is set if a trap occurs during
the instruction before dual-instruction mode is
entered or exited. If DS is set and DIM (Dual
Instruction Mode) is clear, the i860 XP microprocessor
switches to dual-instruction mode one instruction
after returning from the trap handler. If DS and DIM
are both set, the i860 XP microprocessor switches
to single-instruction mode one instruction after
returning from the trap handler.

• When a trap occurs, the i860 XP microprocessor
sets DIM if it is executing in dual-instruction
mode; it clears DIM if it is executing in
single-instruction mode. If DIM is set after returning
from a trap handler, the i860 XP microprocessor
resumes execution in dual-instruction mode.

• When KNF (Kill Next Floating-Point Instruction) is
set, the next floating-point instruction is suppressed
(except that its dual-instruction mode bit
is interpreted). A trap handler sets KNF if the
trapped floating-point instruction should not be
reexecuted.

• SC (Shift Count) stores the shift count used by
the last right-shift instruction. It controls the number
of shifts executed by the double-shift instruction.

• PS (Pixel Size) and PM (Pixel Mask) are used by
the pixel-store and other graphics instructions.
The values of PS control pixel size as defined by
Table 2.2. The bits in PM correspond to pixels to
be updated by the pixel-store instruction pst.d.
The low-order bit of PM corresponds to the
low-order pixel of the 64-bit source operand of pst.d.
The number of low-order bits of PM that are
actually used is the number of pixels that fit into
64 bits, which depends upon PS. If a bit of PM is
set, then pst.d stores the corresponding pixel.
Refer also to the pst.d instruction in section 10.


Table 2.2. Values of PS

Value   Pixel Size in Bits   Pixel Size in Bytes
00      8                    1
01      16                   2
10      32                   4
11      (undefined)          (undefined)


Figure 2.4. Processor Status Register


2.2.4 EXTENDED PROCESSOR STATUS
REGISTER

The extended processor status register (epsr)
contains additional state information for the current
process beyond that stored in the psr. Figure 2.5 shows
the format of the epsr.

• The processor type is 2 for the i860 XP
microprocessor.

• The stepping number has a unique value that
distinguishes among different revisions of the
processor.

• IL (Interlock) is set if a trap occurs after a lock
instruction but before the last BRDY# of the load
or store following the subsequent unlock
instruction. IL indicates to the trap handler that a
locked sequence has been interrupted. When the
trap handler finds IL set, it should scan backwards
for the lock instruction and restart at that
point. The absence of a lock instruction within
30-33 instructions of the trap indicates a
programming error.

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Figure 2.5. Extended Processor Status Register 


• WP (write protect) controls the semantics of the
W bit of page table entries. A clear W bit in either
the directory or the page table entry causes
writes to be trapped. When WP is clear, writes
are trapped in user mode, but not in supervisor
mode. When WP is set, writes are trapped in both
user and supervisor modes.

• PEF (parity error flag) is set by the i860 XP
microprocessor when a parity error trap occurs. As
soon as PEF is set, further parity error and bus
error traps are masked. Software must clear PEF
to reenable such traps. PEF is set at RESET.

• BEF (bus error flag) is set by the i860 XP
microprocessor when the BERR pin is asserted,
indicating a bus error. As soon as BEF is set, further
parity error and bus error traps are masked.
Software must clear BEF to reenable such traps. BEF
is set at RESET.

• INT (Interrupt) is the value of the INT input pin.

• DCS (Data Cache Size) is a read-only field that
tells the size of the on-chip data cache. The number
of bytes actually available is 2^(12 + DCS);
therefore, a value of zero indicates 4 Kbytes, one
indicates 8 Kbytes, etc. The value of DCS for the
i860 XP microprocessor is two, which indicates
16 Kbytes.

• PBM (Page-Table Bit Mode) has no effect in
the i860 XP microprocessor. PBM is used by the
i860 XR microprocessor.

• BE (Big Endian) controls the ordering of bytes
within a data item in memory. Normally (i.e. when
BE is clear) the i860 XP microprocessor operates
in little endian mode, in which the addressed byte
is the low-order byte. When BE is set (big endian
mode), the low-order three bits of all 32-bit
data-load and store addresses are complemented,
then masked to the appropriate boundary for
alignment. This causes the addressed byte to be
the most significant byte. Big endian mode affects
not only the memory load and store instructions
but also the ldio, stio, ldint, and scyc
instructions.

• OF (Overflow Flag) is set by adds, addu, subs,
and subu when integer overflow occurs. For
adds and subs, OF is set if the carry from bit 31
is different from the carry from bit 30. For addu,
OF is set if there is a carry from bit 31. For subu,
OF is set if there is no carry from bit 31. Under all
other conditions, it is cleared by these instructions.
OF may be changed by arithmetic instructions
in either user or supervisor mode. It may be
changed by the st.c instruction in supervisor
mode only. OF controls the function of the intovr
instruction. Inside the trap handler, OF may not
be valid for traps other than one caused by
intovr.

• BS (bus or parity error trap in supervisor mode) is 
set by the i860 XP microprocessor when a bus or
parity error occurs during a supervisor mode 
memory access cycle. This is true even though 
the processor may have switched to user mode 
by the time these errors are reported. The BS bit 
contains valid information only if BERR is assert- 
ed in the same clock as BRDY# or one clock 
after that. In all other conditions the contents of 
the BS bit are undefined. The operating system 
can use this bit to decide, for example, whether 
to abort the process (user mode) or reboot the 
system (supervisor mode). 
• DI (trap on delayed instruction) is set by the
i860 XP microprocessor when a trap occurs on a
delayed instruction (the instruction located after a
delayed branch instruction). When DI is set, the
trap handler must restart the interrupted procedure
from the branch instruction rather than at
the address in fir.

• TAI (trap on autoincrement instruction) is set by 
the i860 XP microprocessor when a trap occurs 
on an instruction with autoincrement. When TAI is
set, the trap handler should undo the autoincre- 
ment (that is, restore src2 to its original value). 

• PT (trap on pipeline use) indicates to the i860 XP
microprocessor that a trap should be generated
and PI should be set when it executes an instruction
that uses the floating-point or graphics unit.
Such instructions include all the instructions
designated “Floating-Point Unit” in Table 2.9, plus
the pfld instruction. PT is set and cleared only by
software. It can be used by the trap handler to
avoid unnecessary saving and restoring of the
pipelines (refer to section 2.8). When a trap due
to PT occurs, the floating-point operation has not
started, and the pipelines have not been advanced.
Such a trap also sets the IT bit of psr.

• The behavior of PI (pipeline instruction) depends
on the setting of PT. If PT = 0, the i860 XP
microprocessor sets PI when any pipelined instruction
or pfld is executed. If PT = 1, the processor
sets PI and traps when it decodes any instruction
that uses the pipes, whether scalar or pipelined.
PI may be set even if KNF is set and the next
floating-point instruction is suppressed. Refer to
section 2.8.

• SO (strong ordering) indicates whether the processor
is in strong ordering mode (SO = 1) or weak
ordering mode (SO = 0). SO is set if the EWBE# 
pin is active (LOW) at RESET. (Refer to the para- 
graphs on write cycle reordering in section 5.) 
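
The OF rules given for adds, addu, subs, and subu in the epsr description can be illustrated with a small Python model of a 32-bit adder. This is a sketch under stated assumptions, not a description of the actual hardware: the helper _adder is hypothetical, and subtraction is assumed to be performed as a + ~b + 1 (two's complement), which is what makes "no carry from bit 31" correspond to an unsigned borrow.

```python
MASK32 = 0xFFFFFFFF

def _adder(a, b, carry_in):
    """32-bit add; returns (result, carry out of bit 30, carry out of bit 31)."""
    a &= MASK32
    b &= MASK32
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    total = a + b + carry_in
    return total & MASK32, carry30, total >> 32

def adds(a, b):
    """Signed add: OF when the carries from bits 31 and 30 differ."""
    result, c30, c31 = _adder(a, b, 0)
    return result, c31 != c30

def addu(a, b):
    """Unsigned add: OF when there is a carry from bit 31."""
    result, _, c31 = _adder(a, b, 0)
    return result, c31 == 1

def subs(a, b):
    """Signed subtract a - b, modeled as a + ~b + 1 (assumption)."""
    result, c30, c31 = _adder(a, ~b, 1)
    return result, c31 != c30

def subu(a, b):
    """Unsigned subtract a - b: OF when there is no carry from bit 31."""
    result, _, c31 = _adder(a, ~b, 1)
    return result, c31 == 0
```

Each function returns the 32-bit result together with the value OF would take under the stated rule; for instance, adding 1 to 0x7FFFFFFF with adds reports overflow, while the same operands with addu do not.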


2.2.5 DATA BREAKPOINT REGISTER 

The data breakpoint register (db) is used to generate
a trap when the i860 XP microprocessor accesses
an operand at the virtual address stored in this
register. The trap is enabled by BR and BW in psr.
When comparing, a number of low-order bits of the
address are ignored, depending on the size of the
operand. For example, a 16-bit access ignores the
low-order bit of the address when comparing to db;
a 32-bit access ignores the low-order two bits. This
ensures that any access that overlaps the address
contained in the register will generate a trap. The
trap occurs before the register or memory update by
the load or store instruction.
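
The comparison rule can be sketched as follows. The function name db_match is hypothetical, and the extension of the rule to 8-bit accesses (no bits ignored) and to 64- and 128-bit accesses (three and four bits ignored) is an assumption extrapolated from the 16- and 32-bit cases stated above.

```python
def db_match(access_addr: int, size_bytes: int, db: int) -> bool:
    """True if an access of size_bytes at access_addr matches db.

    The low-order log2(size_bytes) address bits are ignored, so any
    aligned access that overlaps the byte at db compares equal.
    """
    if size_bytes not in (1, 2, 4, 8, 16):
        raise ValueError("size must be a power of two from 1 to 16 bytes")
    mask = ~(size_bytes - 1) & 0xFFFFFFFF
    return (access_addr & mask) == (db & mask)
```

With db holding 0x1003, a 32-bit access at 0x1000 matches because it covers bytes 0x1000 through 0x1003, whereas a 16-bit access at 0x1000 does not.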


2.2.6 DIRECTORY BASE REGISTER 

The directory base register dirbase (shown in Figure
2.6) controls address translation, caching, and bus
options.

• ATE (Address Translation Enable), when set, en- 
ables the virtual-address translation algorithm. 

• DPS (DRAM Page Size) controls how many bits
to ignore when comparing the current bus-cycle
address with the previous bus-cycle address to
generate the NENE# signal. This feature allows
for higher speeds when using static column or
page-mode DRAMs and consecutive reads and
writes access the same column or page. The
comparison ignores the low-order 12 + DPS bits.
A value of zero is appropriate for one bank of
256K × n RAMs, 1 for 1M × n RAMs, etc. For
interleaved memory, increase DPS by one for
each power of interleaving: add one for 2-way,
two for 4-way, etc.
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Figure 2.6. Directory Base Register 
• When BL (Bus Lock) is set, external bus accesses
are locked. The LOCK# signal is asserted
with the next bus cycle (excluding instruction
fetch and write-back cycles) whose internal bus
request is generated after BL is set. It remains set
on every subsequent bus cycle as long as BL
remains set. The LOCK# signal is deasserted on
the next load or store instruction after BL is
cleared. Traps immediately clear BL. The lock
and unlock instructions control the BL bit. The
result of modifying BL with the st.c instruction is
not defined.

• ITI (Cache and TLB Invalidate), when set in the
value that is loaded into dirbase, causes all entries
in the instruction cache and virtual tags in
the address-translation cache (TLB) to be invalidated.
It also invalidates all virtual tags in the data
cache. The ITI bit does not remain set in dirbase.
ITI always appears as zero when reading
dirbase.

• When software sets the LB bit, the i860 XP micro- 
processor enters two-clock late back-off mode. 
This mode gives two additional clock periods of 
decision time to the external logic that may need 
to use the BOFF# signal to cancel a bus cycle or 
data transfer. If the processor enters one-clock 
late back-off mode during RESET via configura- 
tion pin strapping, the LB bit has no effect, and it 
is impossible to enter two-clock late back-off 
mode. Furthermore, software cannot exit two- 
clock late back-off mode once it is activated; the 
LB bit cannot be cleared except by resetting the 
processor. 

• When CS8 (Code Size 8-Bit) is set, instruction
cache misses are processed as 8-bit bus cycles.
When this bit is clear, instruction cache misses
are processed as 64-bit bus cycles. This bit cannot
be set by software; hardware sets this bit at
initialization time. It can be cleared by software
(one time only) to allow the system to execute out
of 64-bit memory after bootstrapping from 8-bit
EPROM. A nondelayed branch to code in 64-bit
memory should directly follow the st.c (store control
register) instruction that clears CS8, in order
to make the transition from 8-bit to 64-bit memory
occur at the correct time. The branch instruction
must be aligned on a 64-bit boundary.

• RB (Replacement Block) identifies the cache line
(block) to be replaced by cache replacement
algorithms. RB conditions the cache flush instruction
flush, which is discussed in section 10.
Table 2.3 explains the values of RB.

• RC (Replacement Control) controls cache re- 
placement algorithms. Table 2.4 explains the sig- 
nificance of the values of RC. 


• DTB (Directory Table Base) contains the high-or- 
der 20 bits of the physical address of the page 
directory when address translation is enabled (i.e. 
ATE = 1). The low-order 12 bits of the address 
are zeros. 
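
The DPS comparison described in the list above amounts to comparing two addresses after discarding their low-order 12 + DPS bits. The following Python sketch illustrates only that arithmetic; the function name same_dram_page is hypothetical, and the sketch says nothing about signal timing or about which cycles actually drive NENE#.

```python
def same_dram_page(prev_addr: int, cur_addr: int, dps: int) -> bool:
    """Compare two bus-cycle addresses, ignoring the low 12 + DPS bits."""
    if dps < 0:
        raise ValueError("DPS must be non-negative")
    shift = 12 + dps
    return (prev_addr >> shift) == (cur_addr >> shift)
```

With DPS = 0 the addresses 0x0FFF and 0x1000 fall in different 4-Kbyte regions and compare unequal; with DPS = 1 the region is 8 Kbytes and they compare equal.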


Table 2.3. Values of RB 


Value   Replace TLB Block   Replace Instruction and Data Cache Block
00      0                   0
01      1                   1
10      2                   2
11      3                   3


Table 2.4. Values of RC 


Value   Meaning
00      Selects the normal (random) replacement algorithm,
        where any block in the set may be replaced on cache
        misses in all caches.
01      Instruction, data, and TLB cache misses replace the
        block selected by RB. This mode is used for cache
        and TLB testing.
10      Data cache misses replace the block selected by RB.
        Instruction and TLB caches use random replacement.
        This mode is used when flushing the data cache with
        the flush instruction.
11      Disables data and TLB cache replacement. Instruction
        cache uses random replacement.


2.2.7 FAULT INSTRUCTION REGISTER 

When a trap occurs, this register contains the address
of the trapping instruction (not necessarily the
instruction that created the conditions that required
the trap). The fir is a read-only register. In
single-instruction mode, using a ld.c instruction to read the
fir anytime except the first time after a trap saves in
idest the address of the ld.c instruction; in
dual-instruction mode, the address of its floating-point
companion (address of the ld.c - 4) is saved.

2.2.8 FLOATING-POINT STATUS REGISTER 

The floating-point status register (fsr) contains the 
floating-point trap and rounding-mode status for the 
current process. Figure 2.7 shows its format. 
• If FZ (Flush Zero) is clear and underflow occurs,
a result-exception trap is generated. When FZ is
set and underflow occurs, the result is set to zero,
and no trap due to underflow occurs.

• If TI (Trap Inexact) is clear, inexact results do not
cause a trap. If TI is set, inexact results cause a
trap. The sticky inexact flag (SI) is set whenever
an inexact result is produced, regardless of the
setting of TI.

• RM (Rounding Mode) specifies one of the four 
rounding modes defined by the IEEE standard. 
Given a true result b that cannot be represented 
by the target data type, the i860 XP microproces- 
sor determines the two representable numbers a 
and c that most closely bracket b in value (a < 
b < c). The i860 XP microprocessor then rounds 
(changes) b to a or c according to the mode selected
by RM as defined in Table 2.5. Rounding
introduces an error in the result that is less than 
one least-significant bit. 


Table 2.5. Values of RM 


Value   Rounding Mode          Rounding Action
00      Round to nearest       Closer to b of a or c; if equally close,
        or even                select the even number (the one whose
                               least significant bit is zero).
01      Round down             a
        (toward -∞)
10      Round up               c
        (toward +∞)
11      Chop                   Smaller in magnitude of a or c.
        (toward zero)
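
The four rounding actions of Table 2.5 can be tried out with exact rational arithmetic. The sketch below is a simplification, not a model of the floating-point hardware: it assumes the representable numbers form a uniform grid of spacing ulp (real floating-point spacing varies with the exponent), and it treats a grid point as "even" when its integer multiple of ulp is even. The name round_to_grid is hypothetical.

```python
import math
from fractions import Fraction

def round_to_grid(b: Fraction, ulp: Fraction, rm: int) -> Fraction:
    """Round b to a multiple of ulp using the RM encoding of Table 2.5."""
    q = b / ulp
    lo = math.floor(q)            # a = lo * ulp, the neighbor below b
    if q == lo:
        return b                  # already representable, nothing to round
    hi = lo + 1                   # c = hi * ulp, the neighbor above b
    if rm == 0b00:                # round to nearest or even
        below, above = q - lo, hi - q
        if below < above:
            pick = lo
        elif above < below:
            pick = hi
        else:
            pick = lo if lo % 2 == 0 else hi
    elif rm == 0b01:              # round down (toward minus infinity)
        pick = lo
    elif rm == 0b10:              # round up (toward plus infinity)
        pick = hi
    elif rm == 0b11:              # chop (toward zero)
        pick = lo if abs(lo) < abs(hi) else hi
    else:
        raise ValueError("RM is a two-bit field")
    return pick * ulp
```

With ulp = 1, the value 5/2 rounds to 2 in mode 00 (tie, even neighbor) while 7/2 rounds to 4; the value -5/2 rounds to -3 in mode 01 and to -2 in modes 10 and 11. In every case the error is less than one ulp, in line with the statement in the RM description.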
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Figure 2.7. Floating-Point Status Register 
• The U-bit (Update Bit), if set in the value that is
loaded into fsr by a st.c instruction, enables updating
of the result-status bits (AE, AA, AI, AO,
AU, MA, MI, MO, and MU) in the first stage of the
floating-point adder and multiplier pipelines. If this
bit is clear, the result-status bits are unaffected
by a st.c instruction; st.c ignores the corresponding
bits in the value that is being loaded. An st.c
always updates fsr bits 21..17 and 8..0 directly.
The U-bit does not remain set; it always appears
as zero when read.

• The FTE (Floating-Point Trap Enable) bit, if clear,
disables all floating-point traps (invalid input
operand, overflow, underflow, and inexact result).

• SI (Sticky Inexact) is set when the last-stage result
of either the multiplier or adder is inexact (i.e.
when either AI or MI is set). SI is “sticky” in the
sense that it remains set until reset by software.
AI and MI, on the other hand, can be changed by
the subsequent floating-point instruction.

• SE (Source Exception) is set when one of the 
source operands of a floating-point operation is 
invalid; it is cleared when all the input operands 
are valid. Invalid input operands include denor- 
mals, infinities, and all NaNs (both quiet and sig- 
naling). 

• When read from the fsr, the result-status bits MA,
MI, MO, and MU (Multiplier Add-One, Inexact,
Overflow, and Underflow, respectively) describe
the last-stage result of the multiplier.

When read from the fsr, the result-status bits AA,
AI, AO, AU, and AE (Adder Add-One, Inexact,
Overflow, Underflow, and Exponent, respectively)
describe the last-stage result of the adder. The
high-order three bits of the 11-bit exponent of the
adder result are stored in the AE field.

The Adder Add-One and Multiplier Add-One bits 
indicate that the absolute value of the result frac- 
tion grew by one least-significant bit due to 
rounding. AA and MA are not influenced by the 
sign of the result. 

After a floating-point operation in a given unit (ad- 
der or multiplier), the result-status bits of that unit 
are undefined until the point at which result ex- 
ceptions are reported. 

When written to the fsr with the U-bit set, the
result-status bits are placed into the first stage of
the adder and multiplier pipelines. When the
processor executes pipelined operations, it propagates
the result-status bits of a particular unit
(multiplier or adder) one stage for each pipelined
floating-point operation for that unit. When they
reach the last stage, they replace the normal
result-status bits in the fsr and generate traps, if
enabled. When the U-bit is not set, result-status
bits in the word being written to the fsr are
ignored.


In a floating-point dual-operation instruction (e.g.
add-and-multiply or subtract-and-multiply), both
the multiplier and the adder may set exception 
bits. The result-status bits for a particular unit re- 
main set until the next operation that uses that 
unit. 


• RR (Result Register) specifies which floating- 
point register (f0-f31) was the destination register 
when a result-exception trap occurs due to a sca- 
lar operation. 


• IRP (Integer (Graphics) Pipe Result Precision),
MRP (Multiplier Pipe Result Precision), and ARP
(Adder Pipe Result Precision) aid in restoring
pipeline state after a trap or process switch. Each
defines the precision of the last-stage result in
the corresponding pipeline. One of these bits is
set when the result in the last stage of the
corresponding pipeline is double precision; it is cleared
if the result is single precision.

• LRP1 and LRP0 (Load Pipe Result Precision)
together define the size of the last-stage result of
the load pipeline. They are encoded as Table 2.6
shows.




Table 2.6. Values of LRP1 and LRP0

LRP1   LRP0   pfld Length
0      0      (reserved)
0      1      4 Bytes
1      0      8 Bytes
1      1      16 Bytes


2.2.9 KR, KI, T, AND MERGE REGISTERS

The KR, KI, and T registers are special-purpose
registers used by the dual-operation floating-point
instructions pfam, pfsm, pfmam, and pfmsm, which
initiate both an adder operation and a multiplier
operation. The KR, KI, and T registers can store values
from one dual-operation instruction and supply them
as inputs to subsequent dual-operation instructions.
(Refer to Figure 2.16.)

The MERGE register is used only by the graphics
instructions. The purpose of the MERGE register is 
to accumulate (or merge) the results of multiple-ad- 
dition operations that use as operands the color-in- 
tensity values from pixels or distance values from a 
Z-buffer. The accumulated results can then be 
stored in one 64-bit operation. 

Two multiple-addition instructions and an OR in- 
struction use the MERGE register. The addition in- 
structions are designed to add interpolation values 
to each color-intensity field in an array of pixels or to 
each distance value in a Z-buffer. 
Refer to the instruction descriptions in section 10 for
more information about these registers.

2.2.10 BUS ERROR ADDRESS REGISTER 

The bear helps the trap handler determine faulty 
memory locations. The i860 XP microprocessor 
loads a valid address into bear under these condi- 
tions: 

• For bus errors, the bear receives the address of 
the cycle for which the BERR signal is asserted, if 
external hardware asserts BERR in the same
clock as it asserts BRDY# or one clock later.

• For parity errors on a read, the bear receives the 
address of the cycle during which the processor 
detects the error, if external hardware asserts 
PEN# with BRDY# for that cycle. 

If external hardware does not meet these conditions, 
the contents of the bear are undefined. 

A valid address in bear is accurate to 29 bits; that is, 
address signals A31 - A3 are latched in the high-or- 
der 29 bits of bear. At RESET and after every parity 
and bus error trap, software must read the bear be- 
fore further parity and bus error traps can occur. The 
bear is a read-only register. 

2.2.11 PRIVILEGED REGISTERS 

The registers p0, p1, p2, and p3 are provided for the
operating system to use. They do not affect processor
operation. They can be accessed by the ld.c and
st.c instructions, but they can be written only in
supervisor mode. They may be used to store information
such as the interrupt stack pointer, current user
stack pointer at the beginning of the trap handler,
register values during trap handling, processor ID in
a multiprocessor system, or for any other purpose.


2.2.12 CONCURRENCY CONTROL REGISTER 

The concurrency control register (ccr) controls the
operation of the internal Concurrency Control Unit
(CCU), which is described in section 2.5. The ccr
can be written in supervisor mode only, but can be
read in user or supervisor mode. Figure 2.8 shows
the format of the ccr.

DO (Detached Only) bit and CO (CCU On) bit togeth- 
er specify the CCU configuration. DO, when set, indi- 
cates that there is no external CCU. CO (CCU On) 
bit, when set, indicates that the Concurrency Control 
Architecture is enabled. Table 2.7 summarizes the 
modes defined by CO and DO bits. The reserved 
combinations should not be used by software. 

If the DCCU is on (CO = DO = 1), the processor
intercepts and interprets all memory loads and stores
that are to the CCU address space, which is the
two pages defined by CCUBASE. Loads and stores
to that address range do not go to memory, but to
the DCCU.


Table 2.7. Values of CO and DO 


CO   DO   Mode
0    0    External CCU, or no CCU
0    1    reserved
1    0    reserved
1    1    Internal CCU (DCCU) only


CCUBASE is the virtual address of the memory area 
into which the CCU registers are mapped. Software 
must set bit 12 to zero, because the CCUBASE must 
be aligned on a two page (8 Kbyte) boundary. This is 
because an external CCU contains supervisor regis- 
ters mapped to the second page. 
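
The interception rule for the detached CCU can be sketched as an address-range test. This is an illustration only: the function names are hypothetical, CCUBASE is treated here as a full 32-bit virtual address whose low-order 13 bits are zero (the actual field layout is given by Figure 2.8, which is not reproduced in this text), and a 4-Kbyte page size is assumed so that two pages span 8 Kbytes.

```python
PAGE_SIZE = 4096  # assumed 4-Kbyte pages; two pages make the 8-Kbyte window

def ccu_window(ccubase: int) -> tuple[int, int]:
    """Return (first address, one past last address) of the CCU space."""
    if ccubase & (2 * PAGE_SIZE - 1):
        raise ValueError("CCUBASE must be aligned on an 8-Kbyte boundary")
    return ccubase, ccubase + 2 * PAGE_SIZE

def goes_to_dccu(addr: int, ccubase: int, co: bool, do: bool) -> bool:
    """True if a load or store at addr would be intercepted by the DCCU."""
    if not (co and do):          # DCCU is on only when CO = DO = 1
        return False
    first, end = ccu_window(ccubase)
    return first <= addr < end
```

A CCUBASE of 0x80002000 gives a window from 0x80002000 through 0x80003FFF; a value such as 0x80001000, with bit 12 set, is rejected.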


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Figure 2.8. Concurrency Control Register 
2.2.13 NEWCURR REGISTER

The NEWCURR register is part of the detached CCU
(concurrency control unit). It is a 32-bit counter that
supplies an iteration count for loop execution. (Refer
to section 2.5.)

NEWCURR is architecturally a 64-bit register, but
only the low-order 32 bits are provided in this
implementation. Compiler and operating-system data
structures should provide for a 64-bit size for future
implementation.

2.2.14 STAT REGISTER

The STAT register is part of the detached CCU
(concurrency control unit). As Figure 2.9 shows, it
contains the following bits:

InLoop    Indicates that the processor is currently
          executing a concurrent loop. This bit is
          set when a processor starts a concurrent,
          non-nested loop, and it is cleared
          when the processor enters serial code
          when not nested or idle. It can also be
          read or written directly.

Nested    Indicates whether the processor is in the
          nested state. InLoop is copied into this
          bit when starting a nested loop. Otherwise,
          it can be read or written directly.

Detached  Always contains the value of ccr bit DO.

STAT is architecturally a 64-bit register. Compiler
and operating-system data structures should provide
for a 64-bit size for future implementation.

Figure 2.9. Concurrency Status Register


2.3 Addressing

Memory is addressed in byte units with a paged
virtual-address space of 2^32 bytes. Data and
instructions can be located anywhere in this address
space. Address arithmetic uses 32-bit input values
and produces 32-bit results. The low-order 32 bits of
the result are used in case of overflow.

Normally, multibyte data values are stored in memory
in little endian format, i.e. with the least significant
byte at the lowest memory address. As an option,
the ordering can be dynamically selected by software
in supervisor mode. The i860 XP microprocessor
also offers big endian mode, in which the most
significant byte of a data item is at the lowest
address. Figure 2.10 defines by example how data is
transferred from memory over the bus into a register
in both modes. Big endian and little endian data
areas should not be mixed within a 64-bit data word.
Illustrations of data structures in this data sheet
show data stored in little endian mode, i.e. the
rightmost (low-order) byte is at the lowest memory
address.

Code accesses are always done with little endian
addressing. This implies that instructions appear
differently than documented here when accessed as
big endian data. Intel Corporation recommends that
disassemblers running in a big endian system convert
instructions that have been read as data back to
little endian form and present them in the format
documented here.

Page directories and page tables are also accessed
in little endian mode, regardless of the value of the
BE bit.

Big endian mode affects not only the memory load
and store instructions but also the ldio, stio, ldint,
and scyc instructions.

64- and 128-bit big endian accesses are treated the same as little endian accesses.

Figure 2.10. Little and Big Endian Memory Transfers


Alignment requirements are as follows (any violation
results in a data-access trap):

• 128-bit values are aligned on 16-byte boundaries
when referenced in memory (i.e. the four least
significant address bits must be zero).

• 64-bit values are aligned on 8-byte boundaries
when referenced in memory (i.e. the three least
significant address bits must be zero).

• 32-bit values are aligned on 4-byte boundaries
when referenced in memory (i.e. the two least
significant address bits must be zero).

• 16-bit values are aligned on 2-byte boundaries
when referenced in memory (i.e. the least significant
address bit must be zero).
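
The alignment rules, together with the big endian address adjustment described for the BE bit in section 2.2.4 (complement the low-order three address bits, then mask to the operand's alignment boundary), can be combined in one short Python sketch. It is one reading of those rules rather than a definitive model: the names DataAccessTrap and effective_address are hypothetical, and the sketch applies the adjustment uniformly to every operand size.

```python
class DataAccessTrap(Exception):
    """Stands in for the data-access trap raised on misalignment."""

def effective_address(addr: int, size_bytes: int, big_endian: bool) -> int:
    """Check alignment, then apply the big endian address adjustment."""
    if size_bytes not in (1, 2, 4, 8, 16):
        raise ValueError("operand size must be 1, 2, 4, 8, or 16 bytes")
    if addr % size_bytes != 0:
        raise DataAccessTrap(f"{size_bytes}-byte access at {addr:#x}")
    if not big_endian:
        return addr
    # Complement the low three bits, then mask to the operand boundary.
    return (addr ^ 0b111) & ~(size_bytes - 1) & 0xFFFFFFFF
```

Under this reading a byte access at 0x1000 in big endian mode is directed to 0x1007 and a 32-bit access at 0x1000 to 0x1004, while 64- and 128-bit accesses are unchanged, which agrees with the note accompanying Figure 2.10.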


2.4 Virtual Addressing

When address translation is enabled, the processor
maps instruction and data virtual addresses into
physical addresses before referencing memory. This
address transformation is compatible with that of the
Intel386 and Intel486 microprocessors and implements
the basic features needed for page-oriented
virtual-memory systems and page-level protection.

Address translation is optional. It is disabled when
the processor is reset. It is enabled when a store
(st.c) to dirbase sets the ATE bit. The operating
system typically does this during software
initialization. Address translation is disabled again
when st.c clears the ATE bit. The ATE bit must be set
if the operating system is to implement page-oriented
protection or page-oriented virtual memory.

2.4.1 PAGE FRAME 

A page frame is a unit of contiguous addresses of
physical main memory. A page is the collection of
data that occupies a page frame when that data is
present in main memory or occupies some location
in secondary storage when there is not sufficient
space in main memory.

The i860 XP microprocessor architecture supports
two sizes of pages and page frames: four Mbytes
and four Kbytes. Four Kbyte page frames begin on
four Kbyte boundaries and are fixed in size. Four
Mbyte page frames begin on four Mbyte boundaries
and are fixed in size. The four Kbyte address
transformation is compatible with that of the
Intel486 microprocessor.


2.4.2 VIRTUAL ADDRESS

A virtual address refers indirectly to a physical
address by specifying a page and an offset within that
page. Figure 2.11 shows the formats of virtual
addresses. The format for virtual addresses that refer
to four Mbyte pages is different from that of four
Kbyte pages.

Figure 2.12 shows how the i860 XP microprocessor
converts a virtual address into the physical address
by consulting page tables. The addressing mechanism
uses the DIR field as an index into a page
directory. For 4K pages, it uses the PAGE field as an
index into the page table determined by the page
directory and uses the OFFSET field to address a
byte within the page determined by the page table.
For 4M pages, the page directory entry determines
the page address, and the OFFSET field addresses
a byte within that page.
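
The field split described above can be sketched in C. The field widths used here (10-bit DIR, 10-bit PAGE and 12-bit OFFSET for 4K pages; 10-bit DIR and 22-bit OFFSET for 4M pages) are taken from the translation algorithm in Section 2.4.5. The type and function names are hypothetical and serve only as illustration.

```c
#include <stdint.h>

typedef struct {
    uint32_t dir;     /* bits 31..22: index into the page directory      */
    uint32_t page;    /* bits 21..12: index into the page table (4K only) */
    uint32_t offset;  /* byte offset within the page                      */
} va_fields;

/* Split a virtual address that refers to a four Kbyte page. */
static va_fields split_va_4k(uint32_t va)
{
    va_fields f = { va >> 22, (va >> 12) & 0x3FFu, va & 0xFFFu };
    return f;
}

/* Split a virtual address that refers to a four Mbyte page.
 * There is no PAGE field; OFFSET is 22 bits wide. */
static va_fields split_va_4m(uint32_t va)
{
    va_fields f = { va >> 22, 0u, va & 0x3FFFFFu };
    return f;
}
```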


2.4.3 PAGE TABLES 

A page table is simply an array of 32-bit page
specifiers. A page table is itself a page, and contains
4 Kbytes of data or at most 1K 32-bit entries.

At the highest level is a page directory. The page
directory holds up to 1K entries that address either
page tables of the second level or 4-Mbyte pages.

A page table of the second level addresses up to 1K
4-Kbyte pages. All the tables addressed by one
page directory, therefore, can address 1M 4-Kbyte
pages.

Whether 4-Mbyte pages, 4-Kbyte pages, or some
combination of the two are used, one page directory
can cover the entire four gigabyte physical address
space of the i860 XP microprocessor (1K page
directory entries x 4M page or 1K page directory
entries x 1K page table entries x 4K page).

Figure 2.11. Formats of Virtual Addresses

Figure 2.12. Address Translation

The physical address of the current page directory is
stored in the DTB field of the dirbase register.
Memory management software has the option of using
one page directory for all processes, one page
directory for each process, or some combination of the
two.


2.4.4 PAGE-TABLE ENTRIES 

Page-table entries (PTEs) have one of the formats 
shown by Figure 2.13. 


2.4.4.1 Page Frame Address

The page frame address specifies the physical
starting address of a page. In a page directory, the page
frame address is either the address of a page table
or the address of the four Mbyte page frame that
contains the desired memory operand. In a
second-level page table, the page frame address is the
address of the 4-Kbyte page frame that contains the
desired memory operand.

Figure 2.13. Formats of Page Table Entries


2.4.4.2 Present Bit

The P (present) bit indicates whether a page table 
entry can be used in address translation. P = 1 indi- 
cates that the entry can be used. When P = 0 in ei- 
ther level of page tables, the entry is not valid for 
address translation, and the rest of the entry is avail- 
able for software use; none of the other bits in the 
entry is tested by the hardware. If P = 0 in either lev- 
el of page tables when an attempt is made to use a 
page-table entry for address translation, the proces- 
sor signals either a data-access fault or an instruc- 
tion-access fault. In software systems that support 
paged virtual memory, the trap handler can bring the 
required page into physical memory. 

Note that there is no P bit for the page directory
itself. The page directory may be not-present while
the associated process is suspended, but the
operating system must ensure that the page directory
indicated by the dirbase image associated with the
process is present in physical memory before the
process is dispatched.


2.4.4.3 Writable and User Bits 

The W (writable) and U (user) bits are used for
page-level protection, which the i860 XP microprocessor
performs at the same time as address translation.
The concept of privilege for pages is implemented
by assigning each page to one of two levels:

Supervisor level (U = 0)   For the operating system
                           and other systems software
                           and related data.

User level (U = 1)         For applications procedures
                           and data.

The U bit of the psr indicates whether the i860 XP 
microprocessor is executing at user or supervisor 
level. The i860 XP microprocessor maintains the 
U bit of psr as follows: 

• The i860 XP microprocessor clears the psr U bit
to indicate supervisor level when a trap occurs
(including when the trap instruction causes the
trap). The prior value of U is copied into PU.

• The i860 XP microprocessor copies the psr
PU bit into the U bit when an indirect branch is
executed and one of the trap bits is set. If PU was
one, the i860 XP microprocessor enters user
level.

With the U bit of psr and the W and U bits of the 
page table entries, the i860 XP microprocessor im- 
plements the following protection rules: 

• When at user level, a read or write of a
supervisor-level page causes a trap.

• When at user level, a write to a page whose W bit
is not set causes a trap.

• When at user level, a store (st.c) to certain
control registers is ignored.

• When at user level, privileged instructions (ldio,
stio, scyc, ldint) have no effect.


When the i860 XP microprocessor is executing at 
supervisor level, all pages are addressable, but, 
when it is executing at user level, only pages that 
belong to the user level are addressable. 


When the i860 XP microprocessor is executing at 
supervisor level, all pages are readable. Whether a 
page is writable depends upon the write-protection 
mode controlled by WP of epsr: 

WP = 0   All pages are writable.

WP = 1   A write to a page whose W bit is not set
         causes a trap.



When the i860 XP microprocessor is executing at 
user level, only pages that belong to user level and 
are marked writable are actually writable; pages that 
belong to supervisor level are neither readable nor 
writable from user level. 
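
The page-level protection rules of this section can be condensed into a single predicate. The C sketch below is one possible reading of those rules, not Intel code; the function name access_allowed and its parameter names are hypothetical. It covers only memory reads and writes, not the rules for st.c or privileged instructions.

```c
#include <stdbool.h>

/* psr_u:   U bit of psr (true = executing at user level)
 * epsr_wp: WP bit of epsr (supervisor write-protection mode)
 * page_u:  effective U bit of the page (true = user-level page)
 * page_w:  effective W bit of the page (true = writable)
 * Returns true if the access proceeds, false if it traps. */
static bool access_allowed(bool psr_u, bool epsr_wp,
                           bool page_u, bool page_w, bool is_write)
{
    /* User level may not read or write supervisor-level pages. */
    if (psr_u && !page_u)
        return false;
    /* A write to a page with W clear traps at user level, and at
     * supervisor level only when WP = 1. */
    if (is_write && !page_w && (psr_u || epsr_wp))
        return false;
    return true;
}
```

For a four-Kbyte page, page_u and page_w would be the more restrictive combination of the directory and page table attributes, as described in Section 2.4.4.8.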


2.4.4.4 Write-Through Bit 

The i860 XP microprocessor implements both
write-back and write-through caching policies for the
on-chip instruction and data caches. If WT is set, the
write-through policy is applied to data from the
corresponding page. If WT is clear, the normal
write-back policy is applied to data from the page.

For four-Mbyte pages, the WT bit of the page direc- 
tory entry is used. For four-Kbyte pages, only the WT 
bit of the second-level page table entry is used; the 
WT bit of the page directory entry is not referenced 
by the processor, but is reserved. 

The value of the WT bit is driven externally on the
PWT pin, so that external caches can employ the
same policy used internally.


2.4.4.5 Cache Disable Bit 

If a page’s CD (cache disable) bit is set, data from 
the page is not placed in the internal instruction or 
data caches (regardless of the value of the WT bit). 
Clearing CD permits the processor to place data 
from the associated page into internal caches. 

For four-Mbyte pages, the CD bit of the page
directory entry is used. For four-Kbyte pages, only the CD
bit of the second-level page table entry is used; the
CD bit of the page directory entry is not referenced
by the processor, but is reserved.

The value of the CD bit is driven externally on the
PCD pin, so that cacheability can be the same in
both internal and external caches.


2.4.4.6 Accessed and Dirty Bits 

The A (accessed) and D (dirty) bits provide data 
about page usage in both levels of the page tables. 

The i860 XP microprocessor sets the A-bit before a 
read or write operation to a page. For four-Kbyte 
pages, it sets the A-bit of both levels of page tables. 

The processor tests the dirty bit before a write, and,
under certain conditions, causes traps. The trap
handler then has the opportunity to maintain
appropriate values in the dirty bits. For four-Mbyte pages,
the D bit of the page directory entry is used. For
four-Kbyte pages, only the D bit of the second-level page
table entry is used; the D bit of the page directory
entry is not referenced by the processor but is
reserved. The precise algorithm for using these bits
is specified in section 2.4.5.

An operating system that supports paged virtual
memory can use the D and A bits to determine what
pages to eliminate from physical memory when the
demand for memory exceeds the physical memory
available. The D and A bits are normally initialized to
zero by the operating system. The processor sets
the A bit when a page is accessed either by a read
or write operation. When a data-access fault occurs,
the trap handler sets the D bit if an allowable write is
being performed, then reexecutes the instruction.

The operating system is responsible for coordinating 
its updates to the accessed and dirty bits with up- 
dates by the CPU and by other processors that may 
share the page tables. The i860 XP microprocessor 
automatically asserts the LOCK# signal while test- 
ing and setting the A bit. 


2.4.4.7 Page Tables for Trap Handlers 

When paging is enabled (ATE = 1), software that
creates page tables and directories must ensure that
A = 1 always in the PTEs and PDEs for the code
pages of the trap handler and the first data page
accessed by the handler. Preallocation of these
pages is required in case a trap occurs during a lock
sequence. Otherwise, recursive traps would be
generated, as the A-bit would need to be set by the
translation hardware, which is a trapping situation in
itself.


2.4.4.8 Combining Protection of Both Levels of 
Page Tables 

For any four-Kbyte page, the protection attributes of 
its page directory entry may differ from those of its 
page table entry. The i860 XP microprocessor com- 
putes the effective protection attributes for a page 
by examining the protection attributes in both the 
directory and the page table and choosing the more 
restrictive of the two. 


2.4.5 ADDRESS TRANSLATION ALGORITHM 

The following algorithm defines the translation of
each virtual address to a physical address. Let DIR,
PAGE, and OFFSET be the fields of the virtual
address; let PFA1 and PFA2 be the page frame
address fields of the first and second level page tables
respectively; DTB is the page directory table base
address stored in the dirbase register.

1. Read the PDE (Page Directory Entry) at the
physical address formed by DTB:DIR:00.

2. If P in the PDE is zero, generate a data- or in- 
struction-access fault. 

3. If W in the PDE is zero, the operation is write,
and either the U bit of the psr is set or WP = 1,
generate a data-access fault.

4. If the U bit in the PDE is zero and U bit in the psr 
is set, generate a data- or instruction-access 
fault. 

5. If A in the PDE is zero and the TLB miss oc- 
curred inside a locked sequence, generate a 
data or instruction access fault. (The trap allows 
software to set A to one and restart the se- 
quence. This helps external bus hardware deter- 
mine unambiguously what address corresponds 
to a locked semaphore.) 

6. If bit 7 of the PDE is one (four Mbyte page), and
the operation is write, and D = 0 in the PDE,
generate a data-access fault.

7. If A = 1 in the PDE, continue at step 11.
Otherwise, assert LOCK#.

8. Perform the PDE read as in step 1 and the P, W
and U bit checks as in steps 2 through 4.

9. Write the PDE with A bit set. 

10. Deassert LOCK#. 

11. If bit 7 of the PDE is one (four Mbyte page), form 
the physical address as PFA1 :OFFSET, and exit 
address translation. In this case, PFA1 is 10 bits 
and OFFSET is 22 bits. 

12. The remaining steps are for four Kbyte pages. If
the A-bit in the PDE was zero before translation
began, assert LOCK#.

13. Fetch the PTE at the physical address formed
by PFA1:PAGE:00.

14. Perform the P-, W-, U-, and A-bit checks as in
steps 2 through 5 with the second-level PTE. If
A = 0 in the PTE, and the TLB miss occurred
inside a locked sequence, generate a
data or instruction access fault. LOCK#
remains active.

15. If the operation is write, and D in the PTE is 
zero, generate a data access fault. 

16. If the A-bit in the PDE was already active before 
translation began, and the A-bit in the PTE is 
already active, go to step 20. 

17. If LOCK# is not already active, assert it and 
refetch the PTE. 

18. Perform the L)-, W-, and P-bit checks and A-bit 
setting in the PTE as in steps 8 through 9. Do 
the locked write update of the PTE to unlock the 
bus, even if the A-bit in the PTE is already one. 

19. Deassert LOCK#. 

20. Form the physical address as PFA2:OFFSET. In 
this case, PFA2 is 20 bits and OFFSET is 12 
bits. 
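
The checks and address formation in steps 1 through 20 can be sketched in C as follows. This is a simplified illustration, not Intel's implementation: it omits LOCK# handling, the locked-sequence test of step 5, and the A-bit updates of steps 7 through 10 and 16 through 19. The bit positions of P, W, U and D are assumptions based on the stated Intel386-compatible layout (the entry format figure is not reproduced here); only bit 7 as the four Mbyte indicator comes from the text. Physical memory is modeled as a hypothetical word array mem indexed by physical address / 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Intel386-style bit positions (not confirmed by the text). */
#define PTE_P  0x001u   /* present  */
#define PTE_W  0x002u   /* writable */
#define PTE_U  0x004u   /* user     */
#define PTE_D  0x040u   /* dirty    */
#define PDE_4M 0x080u   /* bit 7: four Mbyte page (from the text) */

typedef enum { XLATE_OK, XLATE_FAULT } xlate_status;

/* Steps 2 through 4: present, write-protection and user-level checks. */
static bool entry_permits(uint32_t e, bool write, bool psr_u, bool wp)
{
    if (!(e & PTE_P)) return false;
    if (write && !(e & PTE_W) && (psr_u || wp)) return false;
    if (psr_u && !(e & PTE_U)) return false;
    return true;
}

static xlate_status translate(const uint32_t *mem, uint32_t dtb, uint32_t va,
                              bool write, bool psr_u, bool wp, uint32_t *pa)
{
    uint32_t dir  = va >> 22;
    uint32_t page = (va >> 12) & 0x3FFu;

    /* Step 1: PDE at DTB:DIR:00. */
    uint32_t pde = mem[((dtb & 0xFFFFF000u) | (dir << 2)) >> 2];
    if (!entry_permits(pde, write, psr_u, wp)) return XLATE_FAULT;

    if (pde & PDE_4M) {
        if (write && !(pde & PTE_D)) return XLATE_FAULT;   /* step 6  */
        *pa = (pde & 0xFFC00000u) | (va & 0x003FFFFFu);    /* step 11 */
        return XLATE_OK;
    }

    /* Step 13: PTE at PFA1:PAGE:00. */
    uint32_t pte = mem[((pde & 0xFFFFF000u) | (page << 2)) >> 2];
    if (!entry_permits(pte, write, psr_u, wp)) return XLATE_FAULT; /* 14 */
    if (write && !(pte & PTE_D)) return XLATE_FAULT;               /* 15 */

    *pa = (pte & 0xFFFFF000u) | (va & 0xFFFu);                     /* 20 */
    return XLATE_OK;
}
```

Because both levels are checked independently, a four-Kbyte access succeeds only if the directory entry and the page table entry both permit it, which matches the "more restrictive of the two" rule of Section 2.4.4.8.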


During translation, the i860 XP microprocessor looks
only in external memory for page directories and
page tables. The data cache is not searched.
Therefore, any code that modifies page directories or
page tables must keep them out of the cache. The
tables should be kept in noncacheable memory
or in write-through pages, or should be flushed
from the cache.

The i860 XP microprocessor expects page directories
and page tables to be in little endian format. The
operating system must maintain these tables in little
endian format either by setting BE to zero when
manipulating the tables or by complementing bit two of
the 32-bit address when loading or storing entries.


2.4.6 ADDRESS TRANSLATION FAULTS

The address translation fault can be signalled as
either an instruction-access fault or a data-access
fault. The instruction causing the fault can be
reexecuted upon returning from the trap handler.


2.5 Detached CCU

The i860 XP microprocessor supports parallel
processing, where multiple processors work
simultaneously on different parts of the same problem. The
Concurrency Control Unit (CCU) controls work sharing
among CPUs in multiprocessor systems. The
CCU is a VLSI chip that allows multiple processors
to work together to execute portions of a single
program in parallel. The CCU performs the iteration
assignment for loop parallelization. Accesses to the
CCU for synchronization are much faster than
accesses to shared memory semaphores. The CCU is
memory mapped, and its internal registers are
accessed via memory load and store operations.

To take advantage of the parallel architecture,
software must be compiled by parallelizing compilers
that generate instructions to access the CCU.
However, such instructions cannot run on a system that
does not include a CCU. To allow an application
compiled for parallel execution to run on any system
based on the i860 XP microprocessor, a “Detached
Only” CCU (DCCU, also referred to as “internal
CCU”) is implemented in the i860 XP microprocessor.
The DCCU is a compatible subset of the external
CCU, consisting of the minimal set of features
required for a single CPU. The DCCU alone increases
neither performance nor concurrency, but does
allow software designed for parallel processing to
run unmodified on a single CPU.

2.5.1 DCCU INITIALIZATION 

After reset, the i860 XP microprocessor DCCU is
disabled (CO and DO bits in ccr are cleared). To
enable the DCCU, the CO and DO bits in ccr must be
set by software. Before turning on the CCU, the
operating system must invalidate the TLB and flush the
data cache to make sure that they do not contain
data from the CCU pages. The TLB is invalidated by
setting ITI = 1 in the dirbase register. Also, if the
two pages at the CCUBASE address may have been
cached, the flush instruction must be used once for
each line of the data cache to invalidate the physical
address of the cache entry. The flush is
unneeded if page tables or external hardware have
prohibited caching of the CCUBASE pages.

Neither the external CCU nor the DCCU can be ac- 
cessed within four instructions after ccr is modified. 


2.5.2 DCCU ADDRESSING 

The CCU facilities are memory-mapped and are
manipulated by normal load and store instructions. The DCCU
is memory-mapped to a single 4 Kbyte user page.
When the DCCU is active, all accesses to this page
are satisfied by the DCCU, and no external bus cycle
is generated. The address space of two adjacent
pages beginning on an 8 Kbyte boundary is reserved
for the CCU. The first (lower address) page contains
locations accessible in user mode (which includes
the DCCU registers), and the second page contains
locations accessible in supervisor mode (used for
external CCU only). The base address of these
pages is specified by the CCUBASE field in ccr.
Accesses to the second page in DCCU-only mode
have no effect on the DCCU, and are treated as
normal memory accesses.

When the DCCU is active, accesses to its address
page use only the virtual address, and no translation
is done on the DCCU access. However, the accesses
to an external CCU go through normal address
translation. The operating system should make sure
that the page table entries for the CCU pages are
set so that no fault occurs during address
translation. If an external CCU is used, the two PTEs for the
CCU should have CD = 1 (caching disabled) and
page frame addresses that match the external
hardware addresses of the CCU. Accesses to the DCCU
that cause a TLB miss do not cause the PTE to be
loaded into the TLB.

If the external CCU is used when address translation 
is disabled (ATE = 0), external hardware must deac- 
tivate KEN# for such accesses, to avoid caching 
external CCU accesses. 

2.5.3 DCCU INTERNALS 

The DCCU consists of an address decoder, a 32-bit
counter (NEWCURR), and three bits of state
information (InLoop, Nested, and Detached). InLoop,
Nested and Detached correspond to bits 0, 1, and 2
respectively of the external CCU STAT register. The
Detached bit always reflects the value of the DO bit
in ccr.

Several addresses within the DCCU memory page
are decoded to cause actions to NEWCURR,
InLoop, and Nested state bits. The CCU register to be
accessed is specified by address bits 11-3. The
valid CCU addresses are shown in Table 2.8 with their
mnemonics. Accesses to these addresses may also
have side effects within the DCCU. Refer to the
i860™ Microprocessor Family Programmer's Reference
Manual for programming information. Loads
from any other addresses within the DCCU memory
page return zero; stores to any other addresses
have no effect. Access to the DCCU by any load or
store instructions other than ld.x and st.x produces
undefined results.

Assemblers should encode address bits 2-0 as zero
for accesses in little-endian mode. However, in
big-endian mode (epsr BE bit = 1), DCCU accesses
should have address bit 2 active. Thus, software for
big-endian access to the DCCU must differ from
little-endian software. That allows an external CCU to
be accessed in both big and little endian modes.
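
As an illustration of this addressing rule, the C sketch below computes the byte offset of register cbr_i within the CCU user page. It assumes the reading of Table 2.8 in which A11-A7 are zero, A6-A3 hold the 4-bit index i, A2 is set only for big-endian accesses, and A1-A0 are zero. The function name cbr_offset is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of cbr_i within the CCU user page, for i in 0..15.
 * Address bit 2 is set for big-endian accesses (epsr BE = 1) and
 * clear for little-endian accesses. */
static uint32_t cbr_offset(unsigned i, bool big_endian)
{
    return ((uint32_t)(i & 0xFu) << 3) | (big_endian ? 0x4u : 0x0u);
}
```

The full address would be this offset added to the page base selected by the CCUBASE field of ccr; the encoding of that field is not given in this section, so it is left out of the sketch.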

When reading from the DCCU, the access latency is
the same as reading data from the data cache: the
data is ready for use as a source by the second
instruction after the load. The first instruction after
the load may use the data, but that instruction will
experience a one-clock freeze before the data
becomes available.


2.6 Instruction Set 

Table 2.9 shows the complete set of instructions for 
the i860 XP microprocessor, grouped by function 
within processing unit. Refer to Section 10 for an 
algorithmic definition of each instruction. The in- 
struction set of the i860 XP microprocessor is fully 
upward compatible with that of the i860 XR micro- 
processor, extended in a few ways to better serve 
certain application domains. User-level software ap- 
plications written for the i860 XR microprocessor will 
run unmodified on the i860 XP microprocessor, but
some supervisor code (for example, trap handlers) 
may need minor modifications. The i860 XR micro- 
processor instruction set has been extended with 
the following instructions: 

• ldio, stio: I/O load and store instructions.

• ldint: Load interrupt instruction to perform an
interrupt acknowledge cycle and read the interrupt
vector. Used to emulate the Intel486 interrupt
acknowledge sequence.

• scyc: A special-cycle instruction, used to
generate bus cycles that signal invalidation and
synchronization of an external cache.

• pfld.q: A pipelined, floating-point load of 128 bits.


Table 2.8. CCU Addresses

Mnemonic   A11-A8   A7-A4   Little Endian A3-A0   Big Endian A3-A0
cbr_i      0000     0abc    d000                  d100
eget       1111     0110    0000                  0100
cnewcurr   1111     1100    0000                  0100
cstat      1111     1100    1000                  1100
estate!    1111     1101    0000                  0100
cstatn     1111     1101    1000                  1100
ccim       1111     1110    1000                  1100
ever       1111     1111    1000                  1100

NOTE:
Variable i is a 4-bit index formed by A6-A3. Let its binary
form be represented by the symbols abcd.


Table 2.9. Instruction Set (1 of 2)

Core Unit

Mnemonic    Description

Load and Store Instructions
ld.x        Load integer
st.x        Store integer
fld.y       F-P load
fst.y       F-P store
pfld.y      Pipelined F-P load
pst.d       Pixel store

Register to Register Move
ixfr        Transfer integer to F-P register

Integer Arithmetic Instructions
addu        Add unsigned
adds        Add signed
subu        Subtract unsigned
subs        Subtract signed

Shift Instructions
shl         Shift left
shr         Shift right
shra        Shift right arithmetic
shrd        Shift right double

Logical Instructions
and         Logical AND
andh        Logical AND high
andnot      Logical AND NOT
andnoth     Logical AND NOT high
or          Logical OR
orh         Logical OR high
xor         Logical exclusive OR
xorh        Logical exclusive OR high

Control-Transfer Instructions
br          Branch direct
bri         Branch indirect
bc          Branch on CC
bc.t        Branch on CC taken
bnc         Branch on not CC
bnc.t       Branch on not CC taken
bte         Branch if equal
btne        Branch if not equal
bla         Branch on LCC and add
call        Subroutine call
calli       Indirect subroutine call
intovr      Software trap on integer overflow
trap        Software trap


Floating-Point Unit

Mnemonic    Description

Register to Register Move
fxfr        Transfer F-P to integer register

F-P Multiplier Instructions
fmul.p      F-P multiply
pfmul.p     Pipelined F-P multiply
pfmul3.dd   3-Stage pipelined F-P multiply
fmlow.p     F-P multiply low
frcp.p      F-P reciprocal
frsqr.p     F-P reciprocal square root

F-P Adder Instructions
fadd.p      F-P add
pfadd.p     Pipelined F-P add
famov.r     F-P adder move
pfamov.r    Pipelined F-P adder move
fsub.p      F-P subtract
pfsub.p     Pipelined F-P subtract
pfgt.p      Pipelined greater-than compare
pfeq.p      Pipelined equal compare
fix.v       F-P to integer conversion
pfix.v      Pipelined F-P to integer conversion
ftrunc.v    F-P to integer truncation

Dual-Operation Instructions
pfam.p      Pipelined F-P add and multiply
pfsm.p      Pipelined F-P subtract and multiply
pfmam.p     Pipelined F-P multiply with add
pfmsm.p     Pipelined F-P multiply with subtract

Long Integer Instructions
fisub.z     Long-integer subtract
pfisub.z    Pipelined long-integer subtract
fiadd.z     Long-integer add
pfiadd.z    Pipelined long-integer add

Graphics Instructions
fzchks      16-bit Z-buffer check
pfzchks     Pipelined 16-bit Z-buffer check
fzchkl      32-bit Z-buffer check
pfzchkl     Pipelined 32-bit Z-buffer check
faddp       Add with pixel merge
pfaddp      Pipelined add with pixel merge
faddz       Add with Z merge
pfaddz      Pipelined add with Z merge
form        OR with MERGE register
pform       Pipelined OR with MERGE register


Table 2.9. Instruction Set (2 of 2)

Core Unit

Mnemonic    Description

I/O Instructions
ldio.x      Load I/O
stio.x      Store I/O
ldint.x     Load interrupt vector

System Control Instructions
flush       Cache flush
ld.c        Load from control register
st.c        Store to control register
lock        Begin interlocked sequence
unlock      End interlocked sequence
scyc.x      Special bus cycles

Assembler Pseudo-Operations
mov         Integer move
fmov.r      F-P reg-reg move
pfmov.r     Pipelined F-P reg-reg move
nop         Core no-operation
fnop        F-P no-operation
pfle.p      Pipelined F-P less-than or equal


The architecture of the i860 XP microprocessor uses 
parallelism to increase the rate at which operations 
may be introduced into the unit. Parallelism in the 
i860 XP microprocessor is not transparent; rather, 
programmers have complete control over parallel- 
ism and therefore can achieve maximum perform- 
ance for a variety of computational problems. 


2.6.1 PIPELINED AND SCALAR OPERATIONS 

One type of parallelism used within the floating-point
unit is “pipelining”. The pipelined architecture treats
each operation as a series of more primitive
operations (called “stages”) that can be executed in
parallel. Consider just the floating-point adder as an
example. Let A represent the operation of the adder.
Let the stages be represented by A1, A2, and A3.
The stages are designed such that A(i+1) for one
adder instruction can execute in parallel with A(i) for the
next adder instruction. Furthermore, each A(i) can be
executed in just one clock. The pipelining within the
multiplier and graphics units can be described
similarly, except that the number of stages may be
different.

Figure 2.14 illustrates three-stage pipelining as
found in the floating-point adder (also in the
floating-point multiplier when single-precision input operands
are employed). The central columns of the table
represent the three stages of the pipeline. Each stage
holds intermediate results and also (when
introduced into the first stage by software) holds status
information pertaining to those results. The table
assumes that the instruction stream consists of a
series of consecutive floating-point instructions, all of
one type (i.e. all adder instructions or all
single-precision multiplier instructions). The instructions are
represented as A, B, etc. The rows of the table
represent the states of the unit at successive clock
cycles. Each time a pipelined operation is performed,
the result of the last stage of the pipeline is stored in
the destination register fdest, the pipeline is
advanced one stage, and the input operands of the
operation are transferred to the first stage of the
pipeline.


Clock  Instruction  Stage 1  Stage 2  Stage 3  Result
  1        A           A
  2        B           B        A
  3        C           C        B        A
  4        D           D        C        B      A -> fdest of D
  5        E           E        D        C      B -> fdest of E
  6        F           F        E        D      C -> fdest of F

Figure 2.14. Pipelined Instruction Execution

In the i860 XP microprocessor, the number of
pipeline stages ranges from one to three. A pipelined
operation with a three-stage pipeline stores the
result of the third prior operation. A pipelined operation
with a two-stage pipeline stores the result of the
second prior operation. A pipelined operation with a
one-stage pipeline stores the result of the prior
operation.
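
The "result of the Nth prior operation" behavior can be modeled with a small shift register. The C sketch below is a simplified illustration only: it treats each stage as holding a finished value rather than a partial computation, and the names fp_pipe and pipe_step are hypothetical.

```c
#include <stddef.h>

#define MAX_STAGES 3

typedef struct {
    double stage[MAX_STAGES];  /* stage[0] is the first pipeline stage */
    size_t depth;              /* number of stages in use: 1, 2, or 3  */
} fp_pipe;

/* Models one pipelined operation: the value leaving the last stage is
 * returned (the hardware would store it in fdest), the pipeline is
 * advanced one stage, and new_result enters the first stage. */
static double pipe_step(fp_pipe *p, double new_result)
{
    double out = p->stage[p->depth - 1];
    for (size_t i = p->depth - 1; i > 0; --i)
        p->stage[i] = p->stage[i - 1];
    p->stage[0] = new_result;
    return out;
}
```

With depth set to 3, the value passed to the first call is returned by the fourth call, which mirrors Figure 2.14, where the result of A is stored by instruction D.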

There are four floating-point pipelines: one for the 
multiplier, one for the adder, one for the graphics 
unit, and one for floating-point loads. The adder 
pipeline has three stages. The number of stages in 
the multiplier pipeline depends on the precision of 
the source operands in the pipeline; it may have two 
or three stages. The graphics unit has one stage for 
all precisions. The load pipeline has three stages for 
all precisions.

Changing the FZ (flush zero), RM (rounding mode), 
or RR (result register) bits of fsr while there are re- 
sults in either the multiplier or adder pipeline produc- 
es effects that are not defined. 


2.6.1.1 Scalar Mode

In addition to the pipelined execution mode, the
i860 XP microprocessor also can execute
floating-point instructions in “scalar” mode. Most
floating-point instructions have both pipelined and scalar
variants, distinguished by a bit in the instruction en- 
coding. In scalar mode, the floating-point unit does 
not start a new operation until the previous floating- 
point operation is completed. The scalar operation 
passes through all stages of its pipeline before a 
new operation is introduced, and the result is stored 
automatically. Scalar mode is used when the next 
operation depends on results from the previous few 
floating-point operations (or when the compiler or 
programmer does not want to deal with pipelining). 


2.6.1.2 Pipelining Status Information

Result status information in the fsr consists of the
AA, AI, AO, AU, and AE bits, in the case of the adder,
and the MA, MI, MO, and MU bits, in the case of
the multiplier. This information arrives at the fsr via
the pipeline in one of two ways:

1. It is calculated by the last stage of the pipeline.
This is the normal case.

2. It is propagated from the first stage of the pipeline.
This method is used when restoring the
state of the pipeline after a preemption. When a
store instruction updates the fsr and the U bit
being written into the fsr is set, the store updates
the result-status bits in the first stage of both the
adder and multiplier pipelines. When software
changes the result-status bits of the first stage of
a particular unit (multiplier or adder), the updated
result-status bits are propagated one stage for
each pipelined floating-point operation for that
unit. In this case, each stage of the adder and
multiplier pipelines holds its own copy of the relevant
bits of the fsr. When they reach the last
stage, they override the normal result-status bits
computed from the last-stage result.

At the next floating-point instruction (or at certain
core instructions), after the result reaches the last
stage, the i860 XP microprocessor traps if any of the
status bits of the fsr indicate exceptions. Note that
the instruction that creates the exceptional condition
is not the instruction at which the trap occurs.


2.6.1.3 Precision in the Pipelines

In pipelined mode, when a floating-point operation is
initiated, the result of an earlier pipelined floating-point
operation is returned. The result precision of
the current instruction applies to the operation being
initiated. The precision of the value stored in fdest is
that which was specified by the instruction that initiated
that operation.

If fdest is the same as fsrc1 or fsrc2, the value being
stored in fdest is used as the input operand. In this
case, the precision of fdest must be the same as the
source precision.

The multiplier pipeline has two stages when the
source operands are double-precision and three
stages when they are single. This means that a pipelined
multiplier operation stores the result of the second
previous multiplier operation for double-precision
inputs and third previous for single-precision inputs
(except when changing precisions).


2.6.1.4 Transition between Scalar and Pipelined
Operations


When a scalar operation is executed, it passes 
through all stages of the pipeline; therefore, any un- 
stored results in the affected pipeline are lost. To 
avoid losing information, the last pipelined opera- 
tions before a scalar operation should be dummy 
pipelined operations that unload unstored results 
from the affected pipeline. 

After a scalar operation, the values of all pipeline 
stages of the affected unit (except the last) are un- 
defined. No spurious result-exception traps result 
when the undefined values are subsequently stored 
by pipelined operations; however, the values should
not be referenced as source operands.

For best performance a scalar operation should not
immediately precede a pipelined operation whose 
fdest is nonzero. 
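
The pipelined and scalar behavior described above can be illustrated with a small software model. The following Python sketch is not part of the i860 XP documentation; the class name FloatPipeline and the helper unload are hypothetical, and None is used to stand for an empty or undefined stage. It shows that a pipelined operation returns the result of the third previous operation in a three-stage pipeline, and that dummy pipelined operations are needed to unload results before a scalar operation overwrites them.

```python
from collections import deque


class FloatPipeline:
    """Toy model of one floating-point pipeline (adder or multiplier)."""

    def __init__(self, stages):
        # Index 0 is the first stage, index -1 is the last stage.
        self.stages = deque([None] * stages, maxlen=stages)

    def pipelined_op(self, new_result):
        """Advance one stage: return the last-stage result, accept a new one."""
        leaving = self.stages[-1]
        self.stages.appendleft(new_result)  # maxlen discards the old last stage
        return leaving

    def scalar_op(self, result):
        """A scalar operation passes through every stage of the pipeline."""
        lost = [r for r in self.stages if r is not None]  # unstored results
        depth = self.stages.maxlen
        # Only the last stage is defined afterwards; None marks "undefined".
        self.stages = deque([None] * (depth - 1) + [result], maxlen=depth)
        return result, lost


def unload(pipe):
    """Issue dummy pipelined operations until every real result has come out."""
    return [pipe.pipelined_op(None) for _ in range(pipe.stages.maxlen)]
```

With a three-stage pipeline, issuing operations a, b, c, d returns nothing useful until the fourth operation, which returns a. Calling unload before a scalar operation retrieves b, c, and d instead of losing them.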


2.6.1.5 Pipelined Loads

The pfld instruction is optimized for accesses that
miss the data cache and transfer directly from memory.
Therefore, even when there is a data cache hit,
a pfld may generate a bus cycle. The data from the
internal cache is used only if it was modified. Otherwise,
data is taken from the external bus, even if it
resides in the on-board cache.

The pfld FIFO can be extended externally, because
a pfld always generates a bus cycle
and such a cycle can be identified externally by
the value on the CTYP pin. Software written for an
externally extended pfld pipeline must ensure that it
does not pfld from a location that was modified in
the data cache. When a pfld cache hit to a modified
line occurs, the pfld pipeline length used by the
i860 XP microprocessor is three stages. The modified
data from the cache is put into the internal
three-stage data FIFO, and the third pfld instruction
after the data cache hit will update its fdest register
with the modified data.
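
The data-source rule and the three-stage FIFO can be sketched in Python as follows. This is an illustrative model only; the names pfld_data_source and PfldFifo are hypothetical, and the model ignores bus timing and external FIFO extension.

```python
from collections import deque

FIFO_STAGES = 3  # internal pfld FIFO depth stated in the text


def pfld_data_source(cache_hit, line_modified):
    """Cache data is used only on a hit to a modified line; otherwise the bus."""
    return "cache" if (cache_hit and line_modified) else "bus"


class PfldFifo:
    """Toy model of the internal three-stage pfld data FIFO."""

    def __init__(self):
        self.fifo = deque([None] * FIFO_STAGES)

    def pfld(self, cache_hit, line_modified, cache_value, bus_value):
        """Queue the data for this pfld; return the value written to fdest."""
        source = pfld_data_source(cache_hit, line_modified)
        self.fifo.append(cache_value if source == "cache" else bus_value)
        return self.fifo.popleft()
```

In this model, data queued by one pfld reaches fdest on the third pfld that follows it, which matches the behavior described for a hit to a modified line.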


2.6.2 DUAL-INSTRUCTION MODE 

Another form of parallelism results from the fact that
the i860 XP microprocessor can execute both a
floating-point and a core instruction simultaneously.
Such parallel execution is called dual-instruction
mode. When executing in dual-instruction mode, the
instruction sequence consists of 64-bit aligned instruction
pairs, with a floating-point instruction in the
lower 32 bits and a core instruction in the upper 32
bits. Table 2.9 identifies which instructions are executed
by the core unit and which by the floating-point
unit.
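
The layout of a dual-instruction pair can be illustrated with a short Python sketch. The function name split_pair is hypothetical. The address of the core half (pair address + 4) follows the description of the fault instruction register in section 2.8.1, where the core instruction is located at fir + 4.

```python
def split_pair(pair_address, pair_word):
    """Split a 64-bit dual-instruction pair into its two 32-bit halves."""
    if pair_address % 8 != 0:
        raise ValueError("dual-instruction pairs must be 64-bit aligned")
    fp_instruction = pair_word & 0xFFFFFFFF            # lower 32 bits
    core_instruction = (pair_word >> 32) & 0xFFFFFFFF  # upper 32 bits
    return {
        "fp": (pair_address, fp_instruction),
        "core": (pair_address + 4, core_instruction),
    }
```

For example, split_pair(0x1000, 0xAAAAAAAA55555555) places the floating-point instruction 0x55555555 at 0x1000 and the core instruction 0xAAAAAAAA at 0x1004.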

Programmers specify dual-instruction mode either
by including in the mnemonic of a floating-point instruction
a d. prefix or by using the assembler directives
.dual ... .enddual. Both of the specifications
cause the D-bit of floating-point instructions to be
set. If the i860 XP microprocessor is executing in
single-instruction mode and encounters a floating-point
instruction with the D-bit set, one more 32-bit
instruction is executed before dual-mode execution
begins. If the i860 XP microprocessor is executing in
dual-instruction mode and a floating-point instruction
is encountered with a clear D-bit, then one more pair
of instructions is executed before resuming single-instruction
mode. Figure 2.15 illustrates two variations
of this sequence of events: one for extended sequences
of dual instructions and one for a single instruction
pair.

Note that d.fnop cannot be used to initiate dual-instruction
mode.

Figure 2.15. Dual-Instruction Mode Transitions (1 of 2)

Figure 2.15. Dual-Instruction Mode Transitions (2 of 2)

When a 64-bit dual-instruction pair sequentially follows
a delayed branch instruction in dual-instruction
mode, both 32-bit instructions are executed.


2.6.3 DUAL-OPERATION INSTRUCTIONS 

Special dual-operation floating-point instructions 
(add-and-multiply, subtract-and-multiply) use both 
the multiplier and adder units within the floating- 
point unit in parallel to efficiently execute such com- 
mon tasks as evaluating systems of linear equa- 
tions, performing the Fast Fourier Transform (FFT), 
and performing graphics transformations. 

The instruction classes pfam fsrc1, fsrc2, fdest,
pfmam fsrc1, fsrc2, fdest (add and multiply), pfsm
fsrc1, fsrc2, fdest, and pfmsm fsrc1, fsrc2, fdest
(subtract and multiply) initiate both an adder operation
and a multiplier operation. Six operands are required,
but the instruction format specifies only three
operands; therefore, there are special provisions for
specifying the operands. These special provisions
consist of:

• Three special registers (KR, KI, and T) that can
store values from one dual-operation instruction
and supply them as inputs to subsequent dual-operation
instructions.

  - The constant registers KR and KI can store
    the value of fsrc1 and subsequently supply
    that value to the multiplier pipeline in place of
    fsrc1.

  - The transfer register T can store the last-stage
    result of the multiplier pipeline and subsequently
    supply that value to the adder pipeline
    in place of fsrc1.

• A four-bit data-path control field in the opcode
(DPC) that specifies the operands and loading of
the special registers.

1. Operand-1 of the multiplier can be KR, KI, or
fsrc1.

2. Operand-2 of the multiplier can be fsrc2, the
last-stage result of the multiplier pipeline, or
the last-stage result of the adder pipeline.

3. Operand-1 of the adder can be fsrc1, the
T-register, the last-stage result of the multiplier
pipeline, or the last-stage result of the adder
pipeline.

4. Operand-2 of the adder can be fsrc2, the last- 
stage result of the multiplier pipeline, or the 
last-stage result of the adder pipeline. 

Figure 2.16 shows all the possible data paths surrounding
the adder and multiplier. The DPC field in
these instructions selects different data paths. Section
10 shows the various encodings of the DPC
field.

Note that the mnemonics pfam.p, pfsm.p,
pfmam.p, and pfmsm.p are never used as such in
the assembly language; these mnemonics are used
here to designate classes of related instructions.
Each value of DPC has a unique mnemonic associated
with it.
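
The operand sources listed above can be tabulated in a short Python sketch. The set and function names are hypothetical, and the sketch does not reproduce the DPC encodings of Section 10. It only checks a proposed routing against the four lists, and shows by simple counting that the lists allow more combinations than a four-bit field can select, so only the routings that have a DPC encoding are actually available.

```python
# Legal sources for each operand, as enumerated in the list above.
MULTIPLIER_OP1 = {"KR", "KI", "fsrc1"}
MULTIPLIER_OP2 = {"fsrc2", "multiplier_result", "adder_result"}
ADDER_OP1 = {"fsrc1", "T", "multiplier_result", "adder_result"}
ADDER_OP2 = {"fsrc2", "multiplier_result", "adder_result"}


def is_listed_routing(m1, m2, a1, a2):
    """True if every operand source appears in the corresponding list."""
    return (m1 in MULTIPLIER_OP1 and m2 in MULTIPLIER_OP2
            and a1 in ADDER_OP1 and a2 in ADDER_OP2)


def count_listed_routings():
    """Number of source combinations the four lists permit in principle."""
    return (len(MULTIPLIER_OP1) * len(MULTIPLIER_OP2)
            * len(ADDER_OP1) * len(ADDER_OP2))


DPC_ENCODINGS = 2 ** 4  # a four-bit field selects at most 16 data paths
```

The lists permit 3 x 3 x 4 x 3 = 108 combinations, while the DPC field distinguishes 16.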


Figure 2.16. Dual-Operation Data Paths
(Figure not reproduced. Panel shown: single precision,
3-stage multiplier and adder, with inputs fsrc1 and fsrc2
and output fdest.)


2.7 Addressing Modes

Data access is limited to load and store instructions.
Memory addresses are computed from two fields of
load and store instructions: isrc1 and isrc2.

1. isrc1 either contains the identifier of a 32-bit integer
register or contains an immediate 16-bit address
offset.

2. isrc2 always specifies a register.

Because either isrc1 or isrc2 may be null (zero), a
variety of useful addressing modes result:

offset + register     Useful for accessing fields within a record,
                      where register points to the beginning of the
                      record. Useful for accessing items in a stack
                      frame, where register is r3, the register used
                      for pointing to the beginning of the stack frame.

register + register   Useful for two-dimensional arrays or for array
                      access within the stack frame.

register              Useful as the end result of any arbitrary
                      address calculation.

offset                Absolute address into the first or last 32K of
                      the logical address space.

In addition, the floating-point load and store instruc- 
tions may select autoincrement addressing. In this 
mode isrc2 is replaced by the sum of isrc1 and isrc2
after performing the load or store. This mode makes 
stepping through arrays more efficient, because it 
eliminates one address-calculation instruction. 
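
The address computation can be sketched in Python as follows. The function names are hypothetical. The sketch assumes that the 16-bit immediate offset is sign-extended, which is what makes an offset-only address fall in the first or last 32K of the address space, and that addresses wrap at 32 bits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset (assumption)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(src1, src2_value, src1_is_immediate):
    """Sum of isrc1 (register value or 16-bit offset) and the isrc2 register."""
    base = sign_extend16(src1) if src1_is_immediate else src1
    return (base + src2_value) & MASK32


def fp_access_autoincrement(regs, src1, src2_index, src1_is_immediate):
    """Floating-point load/store with autoincrement: isrc2 receives the sum."""
    address = effective_address(src1, regs[src2_index], src1_is_immediate)
    regs[src2_index] = address  # replaces the separate address calculation
    return address
```

With the register operand equal to zero, an offset of 0x7FFF addresses the top of the first 32K, and an offset of 0xFFFC addresses 0xFFFFFFFC in the last 32K.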


2.8 Traps and Interrupts

Traps are caused by exceptional conditions detected
in programs or by external interrupts. Traps
cause interruption of normal program flow to execute
a special program known as a trap handler.
Traps are divided into the types shown in Table 2.10.

2.8.1 TRAP HANDLER INVOCATION 

This section applies to traps other than reset. When
a trap occurs, execution of the current instruction is
aborted. Except for bus error and parity error traps,
the instruction is restartable. The processor takes
the following steps while transferring control to the
trap handler:

1. Copies U (user mode) of the psr into PU (previous
U).

2. Copies IM (interrupt mode) into PIM (previous
IM).

3. Sets U to zero (supervisor mode).

4. Sets IM to zero (interrupts disabled).

5. If the processor is in dual-instruction mode, it sets
DIM; otherwise it clears DIM.

6. If the processor is in single-instruction mode and
the next instruction will be executed in dual-instruction
mode or if the processor is in dual-instruction
mode and the next instruction will be
executed in single-instruction mode, DS is set;
otherwise, it is cleared.

7. The appropriate trap type bits in psr and epsr are
set (IT, IN, IAT, DAT, FT, OF, IL, PI, PT, BEF,
PEF). Several bits may be set if the corresponding
trap conditions occur simultaneously.

8. An address is placed in the fault instruction register
(fir) to help locate the trapped instruction. In
single-instruction mode, the address in fir is the
address of the trapped instruction itself. In dual-instruction
mode, the address in fir is that of the
floating-point half of the dual instruction. If an instruction
or data access fault occurred, the associated
core instruction is the high-order half of
the dual instruction (fir + 4). In dual-instruction
mode, when a data access fault occurs in the
absence of other trap conditions, the floating-point
half of the dual instruction will already have
been executed (except in the case of the fxfr
instruction).

The processor begins executing the trap handler by
transferring execution to virtual address
0xFFFFFF00. The trap handler begins execution in
single-instruction mode. The trap handler must examine
the trap-type bits in psr (IT, IN, IAT, DAT, FT)
and epsr (OF, IL, PT, PI, BEF, PEF) to determine the
cause or causes of the trap.
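
The eight steps above can be summarized in a Python sketch. It is a software model for illustration, not a description of the hardware implementation. The dictionary layout, the function name enter_trap, and the keys dual_mode, next_dual_mode, and pc are hypothetical; the sketch keeps DIM and DS alongside the other psr fields, and it expects the caller to pass the address of the floating-point half when trapping in dual-instruction mode.

```python
TRAP_HANDLER_ADDRESS = 0xFFFFFF00
PSR_TRAP_BITS = ("IT", "IN", "IAT", "DAT", "FT")
EPSR_TRAP_BITS = ("OF", "IL", "PT", "PI", "BEF", "PEF")


def enter_trap(cpu, causes, trapped_address):
    """Apply trap-entry steps 1 through 8 to a dictionary-based CPU model."""
    psr, epsr = cpu["psr"], cpu["epsr"]
    psr["PU"] = psr["U"]                         # step 1
    psr["PIM"] = psr["IM"]                       # step 2
    psr["U"] = 0                                 # step 3: supervisor mode
    psr["IM"] = 0                                # step 4: interrupts disabled
    psr["DIM"] = 1 if cpu["dual_mode"] else 0    # step 5
    switching = cpu["next_dual_mode"] != cpu["dual_mode"]
    psr["DS"] = 1 if switching else 0            # step 6
    for bit in causes:                           # step 7
        if bit in PSR_TRAP_BITS:
            psr[bit] = 1
        elif bit in EPSR_TRAP_BITS:
            epsr[bit] = 1
        else:
            raise ValueError("unknown trap bit: %s" % bit)
    cpu["fir"] = trapped_address                 # step 8
    cpu["dual_mode"] = False   # the handler starts in single-instruction mode
    cpu["pc"] = TRAP_HANDLER_ADDRESS
    return cpu
```

A handler written against this model would begin by reading the psr and epsr trap bits, then use fir (and fir + 4 in dual-instruction mode) to locate the trapped instruction.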


2.8.2 INSTRUCTION FAULT 

This fault is caused by any of the following condi- 
tions. In all cases the processor sets the IT bit be- 
fore entering the trap handler. 

1. By the trap instruction. When trap is executed in
dual-instruction mode, the floating-point companion
of the trap instruction is not executed before
the trap is taken.

2. By the intovr instruction. The trap occurs only if
OF in epsr is set when intovr is executed. To
distinguish between cases 1 and 2, the trap handler
must examine the instruction addressed by
fir. The trap handler should clear OF before returning.
When intovr causes a trap in dual-instruction
mode, the floating-point companion of
the intovr instruction is completely executed before
the trap is taken.

Table 2.10. Types of Traps

                      Indication              Caused by
Type                  psr   epsr      fsr     Condition                      Instruction
--------------------  ----  --------  ------  -----------------------------  -----------------------------
Instruction Fault     IT                      Software traps                 trap
                            OF                                               intovr
                            IL                Missing unlock                 Any
                            PT & PI           Pipeline usage                 Any scalar or pipelined
                                                                             instruction that uses a
                                                                             pipeline

Floating-Point Fault  FT              SE      Floating-point source          Any M- or A-unit except
                                              exception                      fmlow
                                              Floating-point result          Any M- or A-unit except
                                              exception:                     fmlow, pfgt, and pfeq.
                                      AO, MO    overflow                     Reported on any F-P
                                      AU, MU    underflow                    instruction, pst, fst, and
                                      AI, MI    inexact result               sometimes fld, pfld, and ixfr

Instruction Access    IAT                     Address translation            Any
Fault                                         exception during instruction
                                              fetch

Data Access Fault     DAT                     Load/store address             Any load/store
                                              translation exception
                                              Misaligned operand address     Any load/store
                                              Operand address matches        Any load/store
                                              db register

Parity Error Fault    IN    PEF               Parity error on data pins during bus read operation
                                              when PEN# pin active

Bus Error Fault       IN    BEF               External interrupt signal on BERR pin

Interrupt             IN    INT               External interrupt signal on INT pin

Reset                 None  PEF, BEF          Hardware RESET signal


3. By violation of lock/unlock protocol, explained
below. (Note that trap and intovr should not be
used within a locked sequence; otherwise, it
would be difficult to distinguish between this and
the prior cases.)

4. By execution of an instruction that uses a pipeline
when the PT bit of epsr is set. (Refer to section
2.8.2.2.)


2.8.2.1 Lock Protocol

The lock protocol requires the following sequence of
activities:

1. lock 

2. Any load or store instruction. For compatibility 
with future processor generations, this should be 
a load. 

3. unlock 

4. Any load or store instruction. For compatibility 
with future processor generations, this should be 
a store. 


There may be other instructions between any of
these steps. The bus is locked after step 2, and remains
locked until step 4. Step 4 must follow step 1
by 30 instructions or less; otherwise, an instruction
trap occurs. In case of a trap, IL is also set. If the
load or store instruction of step 2 accesses a previously
unaccessed page (A = 0), the bus is locked
briefly while the A bit is set, unlocked, then locked
again to satisfy the lock instruction and start the
locked sequence.
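
A static check of the lock protocol might look like the following Python sketch. The function name and the instruction-kind strings ("lock", "ldst", "unlock", "other") are hypothetical. The sketch assumes that "30 instructions or less" is measured as the distance in instructions from the lock instruction to the load or store of step 4; the exact counting rule used by the processor is not restated here.

```python
MAX_LOCKED_SPAN = 30  # step 4 must follow step 1 by 30 instructions or less


def check_lock_sequence(instructions):
    """Return True if the sequence starting at a lock completes in time.

    instructions is a list of kind strings whose first element is "lock".
    """
    if not instructions or instructions[0] != "lock":
        raise ValueError("sequence must start with lock")
    expected = ["ldst", "unlock", "ldst"]  # steps 2, 3, and 4
    step = 0
    for distance, kind in enumerate(instructions[1:], start=1):
        if distance > MAX_LOCKED_SPAN:
            break  # too far from the lock: an instruction trap would occur
        if kind == expected[step]:
            step += 1
            if step == len(expected):
                return True
    return False
```

Other instructions may appear between the steps; the check only requires that the three remaining steps occur in order within the allowed distance.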

2.8.2.2 Using PT and PI Bits 

The PI and PT bits are provided to help the trap
handler avoid unnecessarily saving and restoring the
pipelines (refer to the section "Pipeline Preemption"
in the i860 Microprocessor Family Programmer’s
Reference Manual).

Trap handlers that use PI or PT must initially examine
fsr. If a pending trap exists, that is, if the FTE
(floating-point trap enable) bit is set and any of the
floating-point exception bits (AI, AO, AU, MI, MO,
MU) is active, the trap handler must save the pipelines.
The i860 XP microprocessor, like the i860 XR
microprocessor, may set an fsr exception bit before
the floating-point trap is generated, and this pending
trap relies on information in the pipeline. For example,
an external interrupt might invoke the trap handler
between the scalar floating-point instruction that
produces an overflow and the next floating-point
operation, the one that would cause a branch to the
trap handler for the floating-point trap.

If no pending trap exists, the handler can follow ei- 
ther of the following two methods: 

• Using both PT and PI: Upon invocation, the trap
handler saves the state of PI and PT (in epsr),
but does not save the pipes. If PI is found set
(which means that the interrupted code needs
the state information currently in the floating-point
pipelines), the handler sets PT and clears PI
(with a single st.c to epsr instruction), then continues
with trap processing. If the pipes are used
during trap handling (even by a scalar instruction),
a trap will be generated with IT and PI set
by hardware. The trap handler may then check PI
and PT, and if both are set, clear PT, PI, and IT,
save the pipes, set an indication that they were
saved, and restart execution from the instruction
that caused the trap. At the end of trap handling,
the trap handler restores the pipes if they were
saved, and restores PI and PT to their values before
the trap. This method avoids both saving and
restoring the pipes, assuming that most trap handling
sequences do not alter the pipes, and therefore
a trap for PT = 1 will not happen very often.

• Using only PI: Another approach is to leave 
PT = 0, using only the PI bit, which the processor 
sets each time a pipelined instruction or pfld is
encountered (even if the floating point instruction 
is suppressed due to KNF = 1). The trap handler 
saves PI, saves the pipes if PI is set, sets an indi- 
cation that they were saved, and clears PI. At the 
end of trap handling, the trap handler restores the 
pipes if they were saved, and restores PI to its 
value before the trap. With this method, the pipes 
are sometimes saved and restored unnecessarily 
if the trap handler code does not use the pipes. 
This method is advised when it is known that the 
trap handler uses the pipes. 
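
The entry decisions of the two methods can be compared in a Python sketch. The function names and the dictionary representation of fsr and epsr are hypothetical. The sketch covers only the decision made on entry to the handler, not the nested trap or the restore at the end of handling, and it leaves PT and PI unchanged in the pending-trap case because the text does not spell out their handling there.

```python
FP_EXCEPTION_BITS = ("AI", "AO", "AU", "MI", "MO", "MU")


def pending_fp_trap(fsr):
    """True when FTE is set and any floating-point exception bit is active."""
    return bool(fsr["FTE"]) and any(fsr[bit] for bit in FP_EXCEPTION_BITS)


def plan_handler_entry(fsr, epsr, method):
    """Return (save_pipes_now, new_PT, new_PI) for the chosen method.

    method is "pt_and_pi" or "pi_only".
    """
    pt, pi = epsr["PT"], epsr["PI"]
    if pending_fp_trap(fsr):
        # The pending trap relies on pipeline contents: always save.
        return True, pt, pi
    if method == "pt_and_pi":
        if pi:
            return False, 1, 0  # arm PT, clear PI; save only if pipes get used
        return False, pt, pi
    if method == "pi_only":
        return bool(pi), 0, 0   # save if PI was set, then clear PI
    raise ValueError("unknown method: %s" % method)
```

The first method defers the save until a pipeline is actually used, at the cost of a possible extra trap; the second saves immediately whenever PI is set.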

2.8.3 FLOATING-POINT FAULT 

The floating-point fault is reported on floating-point
instructions, pst, fst, and sometimes fld, pfld, and
ixfr. The floating-point faults of the i860 XP microprocessor
support the floating-point exceptions defined
by the IEEE standard as well as some other
useful classes of exceptions. The i860 XP microprocessor
divides these into two classes: source exceptions
and result exceptions. The numerics library
supplied by Intel provides the IEEE standard default
handling for all these exceptions.

2.8.3.1 Source Exception Faults

All exceptional operands, including infinities, denor- 
malized numbers and NaNs, cause a floating-point 
fault and set SE in the fsr. Source exceptions are 
reported on the instruction that initiates the opera- 
tion. For pipelined operations, the pipeline is not ad- 
vanced. 

SE is undefined for faults on fld, pfld, fst, pst, and
ixfr instructions under these conditions:

• In single-instruction mode, always.

• In dual-instruction mode, when the companion instruction
is not a multiplier or adder operation.

2.8.3.2 Result Exception Faults 

The result exceptions include: 

• Overflow. The absolute value of the rounded true 
result would exceed the largest positive finite 
number in the destination format. 

• Underflow (when FZ is clear). The absolute value 
of the rounded true result would be smaller than 
the smallest positive finite number in the destina- 
tion format. 

• Inexact result (when TI is set). The result is not
exactly representable in the destination format.
For example, the fraction 1/3 cannot be precisely
represented in binary form. This exception occurs
frequently and indicates that some (generally acceptable)
accuracy has been lost.
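
The inexact-result condition can be demonstrated on any machine with IEEE double-precision arithmetic. The following Python sketch is independent of the i860 XP; the function name is hypothetical. It compares the exact rational value of a quotient with the exact value of the double that results from the division.

```python
from fractions import Fraction


def is_exact_in_double(numerator, denominator):
    """True if numerator/denominator is exactly representable as a double."""
    true_value = Fraction(numerator, denominator)
    stored = Fraction(numerator / denominator)  # exact value of the rounded result
    return stored == true_value
```

The quotient 1/2 is exact, while 1/3 and 1/10 are not, so dividing to produce them would raise the inexact condition.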

The point at which a result exception is reported de- 
pends upon whether pipelined operations are being 
used: 

• Scalar (nonpipelined) operations. Result exceptions
are reported on the next floating-point,
fst.x, or pst.x (and sometimes fld, pfld, ixfr) instruction
after the scalar operation. When a trap
occurs, the last stage of the affected unit contains
the result of the scalar operation.

• Pipelined operations. Result exceptions are reported
when the result is in the last stage and the
next floating-point (and sometimes fld, pfld, ixfr)
instruction is executed. When a trap occurs, the
pipeline is not advanced, and the last-stage results
(that caused the trap) remain unchanged.

When no trap occurs (either because FTE is clear or
because no exception occurred), the pipeline is advanced
normally by the new floating-point operation.
The result-status bits of the affected unit are undefined
until the point that result exceptions are reported.
At this point, the last-stage result-status bits (bits
29..22 and 16..9 of the fsr) reflect the values in the
last stages of both the adder and multiplier. For example,
if the last-stage result in the multiplier has
overflowed and a pfadd is started, a trap occurs and
MO is set.

For scalar operations, the RR bits of fsr report in 
which register the result was stored. RR is updated 
when the scalar instruction is initiated. The result ex- 
ception trap, however, occurs on a subsequent in- 
struction. Programmers must prevent intervening 
stores to fsr from modifying the RR bits. Prevention 
may take one of the following forms: 

• Before any store to fsr when a result exception 
may be pending, execute a dummy floating-point 
operation to trigger the result-exception trap. 

• Always read from fsr before storing to it, and 
mask updates so that the RR bits are not 
changed. 
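
The second form of prevention, a masked read-modify-write of the fsr, can be sketched in Python as follows. The function name is hypothetical, and the position of the RR field (a five-bit field at bits 21..17) is an assumption of this sketch that should be checked against the fsr layout before use.

```python
RR_SHIFT = 17               # assumed position of the RR field
RR_MASK = 0x1F << RR_SHIFT  # assumed five-bit field


def fsr_value_preserving_rr(current_fsr, desired_fsr):
    """Combine the desired fsr value with the RR bits read from the fsr."""
    kept_rr = current_fsr & RR_MASK
    return ((desired_fsr & ~RR_MASK) | kept_rr) & 0xFFFFFFFF
```

The caller reads the fsr, passes the value read together with the value it intends to store, and stores the returned word, so that a pending result exception still finds the correct register number in RR.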

For pipelined operations, RR is cleared; the result is 
in the last stage of the pipeline of the appropriate 
unit. The trap handler must flush the pipeline, saving 
the results and the status bits. 

In either pipelined or scalar mode, the trap handler 
must compute the result to be returned. In either 
case, the result delivered by the CPU has the same 
significand as the true result and has an exponent 
that is the low-order bits of the true result. The trap 
handler can inspect the delivered result, compute 
the result appropriate for that instruction (a NaN or 
an Infinity, for example), and store the computed re- 
sult. If RR is nonzero, the trap handler must store 
the computed result in the register specified by RR; 
if RR is zero, it must load the last stage of the pipe- 
line with the computed result instead of the saved 
result. 

Result exceptions may be reported for both the ad- 
der and multiplier at the same time. In this case, the 
trap handler should fix up the last stage of both pipe- 
lines. 


2.8.4 INSTRUCTION ACCESS FAULT 

This trap occurs during address translation for in- 
struction fetches in any of these cases: 

• The address fetched is in a page whose P (pres- 
ent) bit in the page table is clear (not present). 


• The address fetched is in a supervisor mode
page, but the processor is in user mode.

• The address fetched is in a page whose PTE has
A = 0, and the access occurs during a locked
sequence (i.e., between lock and unlock).


Note that several instructions are fetched at one 
time, either due to instruction prefetching or to in- 
struction caching. Therefore, a trap handler can 
change from supervisor to user mode and continue 
to execute instructions fetched from a supervisor 
page. An instruction access trap occurs only when 
the next group of instructions is fetched from a su- 
pervisor page (up to eight instructions later). If, in the 
meantime, the handler branches to a user page, no 
instruction access trap occurs. No protection viola- 
tion results, because the processor does not permit 
data accesses to supervisor pages while running in 
user mode. 



2.8.5 DATA ACCESS FAULT 

This trap results from an abnormal condition detect- 
ed during data operand fetch or store. Such an ex- 
ception can be due only to one of the following caus- 
es: 

• An attempt is being made to write to a page
whose D (dirty) bit is clear.

• A memory operand is misaligned (is not located
at an address that is a multiple of the length of
the data).

• The address stored in the debug register is equal
to one of the addresses spanned by the operand.

• The operand is in a not-present page.

• An attempt is being made from user level to write 
to a read-only page or to access a supervisor-lev- 
el page. 

• The operand is in a page whose PTE has A = 0, 
and the access occurs during a locked sequence 
(i.e. between lock and unlock). 

• Write protection (determined by epsr bit WP = 1 ) 
is violated in supervisor mode. 
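
The causes listed above can be collected into a single check, sketched here in Python. The function name, its parameters, and the dictionary representation of a page table entry are hypothetical. The PTE keys P, D, and A follow the text; the keys W (writable) and U (user-accessible) are assumptions of this sketch, as is the use of the BR and BW enables from section 2.9 to qualify the debug register match. The sketch reports every applicable cause and makes no claim about hardware priority.

```python
def data_access_fault_causes(addr, size, is_write, user_mode, locked, pte,
                             db=None, br=False, bw=False, wp=False):
    """Return the list of data access fault causes for one operand access."""
    causes = []
    if addr % size != 0:
        causes.append("misaligned operand")
    watching = bw if is_write else br
    if db is not None and watching and addr <= db < addr + size:
        causes.append("debug register match")
    if not pte["P"]:
        causes.append("page not present")
        return causes  # the remaining PTE bits are meaningless
    if is_write and not pte["D"]:
        causes.append("write to page with D clear")
    if user_mode and (not pte["U"] or (is_write and not pte["W"])):
        causes.append("user-level protection violation")
    if locked and not pte["A"]:
        causes.append("A = 0 during locked sequence")
    if not user_mode and wp and is_write and not pte["W"]:
        causes.append("supervisor write protection (WP = 1)")
    return causes
```

An empty list means that the access proceeds without a data access fault in this model.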

When a data access trap is taken on a pipelined
floating-point instruction that occurs immediately after
the load or store instruction that causes the trap,
the destination register of the pipelined floating-point
instruction may be partially updated. Correct execution
will occur when the trap handler resumes execution
after handling the DAT, because the pipelined
floating-point instruction will then correctly update its
destination register.

2.8.6 PARITY ERROR TRAP

If the PEN# pin is active and the bus unit detects a 
parity error during a bus read operation, the proces- 
sor sets PEF and IN, then generates a trap. Further 
parity error traps are masked as soon as PEF is set. 
To reenable such traps, software must clear PEF 
and unfreeze BEAR by executing ld.c bear, rdest.

The interrupted program is not restartable. BS (bus 
or parity error trap in supervisor mode) is set by the 
i860 XP microprocessor when a parity error occurs 
while the processor is in supervisor mode. The oper- 
ating system can use this bit to decide, for example, 
whether to abort the process (user mode) or reboot 
the system (supervisor mode). 

2.8.7 BUS ERROR TRAP 

When external hardware asserts the BERR pin, the
processor sets BEF (bus error flag) and IN (interrupt),
and then traps. Further BERR traps are
masked as soon as BEF is set by hardware. To
reenable such traps, software must clear BEF and
unfreeze BEAR by executing ld.c bear, rdest.

BS (bus or parity error trap in supervisor mode) is set
by the i860 XP microprocessor when a bus error oc- 
curs while the processor is in supervisor mode. The 
operating system can use this bit to decide, for ex- 
ample, whether to abort the process (user mode) or 
reboot the system (supervisor mode). 

2.8.8 INTERRUPT TRAP 

An interrupt is an event that is signaled from an external
source. If the processor is executing with interrupts
enabled (IM set in the psr), the processor
sets the interrupt bit IN in the psr and INT in the
epsr, then generates an interrupt trap.

Vectored interrupts are implemented by interrupt
controllers and software. Software can use the ldint
instruction to generate an interrupt acknowledge
(INTA) cycle. This instruction generates a bus cycle
with INTA cycle specifications, and places the data
returned from the bus in the destination register.
Tags are not checked in the data cache for a hit, and
the cycle is not burstable.

The Intel 486 microprocessor generates two INTA
cycles as a response to an interrupt and inserts four
idle clocks in between. To generate an interrupt acknowledge
sequence that is compatible with the
Intel 486 microprocessor, the ldint instruction sequence
documented in section 5.1.4 should be executed.


2.8.9 RESET TRAP 

When the i860 XP microprocessor is reset, execution
begins in single-instruction mode at virtual address
0xFFFFFF00. This is the same address as for
other traps. The reset trap can be distinguished from
other traps by the fact that no trap bits are set. The
instruction cache is flushed. The bits DPS, BL, and
ATE in dirbase are cleared. CS8 is initialized by the
value at the INT pin at the end of reset. The read-only
fields of the epsr are set to identify the processor,
while the IL, WP, and PBM bits are cleared. The
bits U, IM, BR, and BW in psr are cleared, as are the
trap bits FT, DAT, IAT, IN, and IT. All other bits of
psr and all other register contents are undefined.
Refer to Table 2.11 for a summary of these initial
settings.

The software must ensure that the control registers
are properly initialized before performing operations
that depend on the values of those registers.

Reset code must initialize the floating-point pipeline 
state to zero with floating-point traps disabled to en- 
sure that no spurious floating-point traps are gener- 
ated. 

After a RESET the i860 XP microprocessor starts 
execution at supervisor level (U = 0). Before branch- 
ing to the first user-level instruction, the RESET trap 
handler or subsequent initialization code has to set 
PU and a trap bit so that an indirect branch instruc- 
tion will copy PU to U, thereby changing to user lev- 
el. 


2.9 Debugging 

The i860 XP microprocessor supports debugging
with both data and instruction breakpoints. The features
of the i860 XP microprocessor architecture
that support debugging include:

• db (data breakpoint register), which permits
specification of a data address that the i860 XP
microprocessor will monitor.

• BR (break read) and BW (break write) bits of the
psr, which enable trapping of either reads or
writes (respectively) to the address in db.

• DAT (data access trap) bit of the psr, which allows
the trap handler to determine when a data
breakpoint was the cause of the trap.

• trap instruction that can be used to set breakpoints
in code. Any number of code breakpoints
can be set. The values of the isrc1 and isrc2
fields help identify which breakpoint has occurred.

• IT (instruction trap) bit of the psr, which allows
the trap handler to determine when a trap
instruction was the cause of the trap.

Table 2.11. Register and Cache Values after Reset

Registers                  Initial Value
-------------------------  ---------------------------------------------
Integer Registers          Undefined
Floating-Point Registers   Undefined
psr                        U, IM, BR, BW, FT, DAT, IAT, IN, IT = 0;
                           others are undefined
epsr                       IL, WP, PBM, BE, PT = 0; BEF, PEF = 1;
                           Processor Type, Stepping Number, DCS,
                           SO are read only; others are undefined
db                         Undefined
dirbase                    DPS, BL, LB, ATE = 0; others are undefined
fir                        Undefined
fsr                        Undefined
bear                       Undefined
p3-p0                      Undefined
ccr                        CO, DO = 0; others are undefined
KR, KI, T, MERGE           Undefined
NEWCURR                    Undefined
STATUS                     InLoop, Nested, Detached = 0

Caches                     Initial Value
-------------------------  ---------------------------------------------
Instruction Cache          All entries invalid
Data Cache                 All entries invalid
TLB                        All entries invalid


3.0 ON-CHIP CACHES 

By holding data, instructions, and address transla- 
tion on-chip, the caches of the i860 XP microproces- 
sor provide the following advantages: 

1. Low chip count for the CPU subsystem.

2. Wide processor-to-cache path: 16 bytes for data,
8 bytes for instructions.

3. Fast access without requiring much additional 
high-speed design in the system. The fast 
(50 MHz) cache-access circuitry is hidden on 
chip; the external bus can respond more slowly 
without significantly degrading performance. 


3.1 Address Translation Caches

The i860 XP microprocessor allows both four-Kbyte
and four-Mbyte page sizes, and a separate translation
look-aside buffer (TLB) is used to cache address
translation information for each page size. The
TLB for four-Kbyte pages (Figure 3.1) has 64 entries,
and the TLB for four-Mbyte pages (Figure 3.2) has
16 entries. Both are four-way set associative. The
TLBs function when paging is enabled. When a page
is first accessed, its translation information is saved
in the appropriate TLB along with other page attributes,
such as access rights and cacheability. Every
address translation operation looks up the virtual address
simultaneously in both TLBs. Only if the necessary
paging information is not in either of the
caches must the paging tables in memory be referenced.
Both TLBs employ a random replacement algorithm
to choose which of the four ways to replace.
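
The organization of the two TLBs can be modeled with the following Python sketch. It is illustrative only: the class and function names are hypothetical, the choice of set index (virtual page number modulo the number of sets) and the policy of filling empty ways before replacing at random are assumptions, and page attributes such as access rights are omitted. The actual organization is shown in Figures 3.1 and 3.2.

```python
import random


class SetAssociativeTLB:
    """Toy four-way set-associative TLB with random replacement."""

    def __init__(self, entries, ways=4, page_shift=12, rng=None):
        self.ways = ways
        self.num_sets = entries // ways  # 64/4 = 16 sets, 16/4 = 4 sets
        self.page_shift = page_shift
        self.rng = rng or random.Random()
        self.sets = [[None] * ways for _ in range(self.num_sets)]

    def _locate(self, vaddr):
        vpn = vaddr >> self.page_shift
        return vpn, self.sets[vpn % self.num_sets]  # assumed set index

    def lookup(self, vaddr):
        """Return the cached frame number, or None on a miss."""
        vpn, ways = self._locate(vaddr)
        for entry in ways:
            if entry is not None and entry[0] == vpn:
                return entry[1]
        return None

    def insert(self, vaddr, frame):
        """Cache a translation, replacing a random way if the set is full."""
        vpn, ways = self._locate(vaddr)
        for i, entry in enumerate(ways):
            if entry is None or entry[0] == vpn:
                ways[i] = (vpn, frame)
                return
        ways[self.rng.randrange(self.ways)] = (vpn, frame)


def translate(vaddr, tlb_4k, tlb_4m):
    """Look up both TLBs; return a physical address or None on a miss."""
    for tlb in (tlb_4k, tlb_4m):
        frame = tlb.lookup(vaddr)
        if frame is not None:
            offset_mask = (1 << tlb.page_shift) - 1
            return (frame << tlb.page_shift) | (vaddr & offset_mask)
    return None  # the paging tables in memory must be referenced
```

A 64-entry instance with page_shift=12 models the four-Kbyte TLB, and a 16-entry instance with page_shift=22 models the four-Mbyte TLB. The hardware consults both at the same time; the loop in translate is only a sequential stand-in for that parallel lookup.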

If an instruction’s virtual address is found in the in- 
struction cache, the virtual address is not translated, 
and code access rights are not verified. However, 
when an instruction’s virtual address is not found in 
the cache, address translation does occur, and all 
access rights are verified. The virtual addresses of 
data are always translated, and access rights are 
always verified. 

The i860 XP microprocessor requires simultaneous 
access to data and instruction caches, but the TLBs 
can service only one address translation at a time. 
If both are required at the same time, data address
translation has higher priority in the TLBs than
instruction address translation.

Any data or instruction access fault halts address
translation at once, and the TLB is not updated. If a 
directory read causes an access fault, the page ta- 
ble is not read at all. 

If the paging unit generates a fault (in setting the D
bit for the first write to a nondirty page, for example),
the corresponding entry is deleted from the TLB.
Therefore, software does not need to invalidate the
TLB entry in response to DAT or IAT faults.

Figure 3.1. 4K TLB Organization

Figure 3.2. 4M TLB Organization

If TLB replacement is initiated during a locked
sequence generated by the lock instruction and if
another locked sequence has to be executed to set the
A-bit, the paging unit generates an access fault. This
helps external hardware implement “locking by
address” by preventing generation of nested lock
sequences.


3.2 Internal Instruction and Data Caches

The i860 XP microprocessor has separate data and
instruction caches on-chip. Having separate caches
for instructions and data allows simultaneous cache
look-up. Up to two instructions and 128 bits of data
can be accessed simultaneously from these caches.
The data and instruction caches hold 16 Kbytes
each. A line can be filled from memory with a
four-transfer burst.
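
The cache geometry follows from figures given elsewhere in this chapter. The short calculation below is a sketch that combines the 16-Kbyte size with the 32-byte line, the four blocks per set mentioned in section 3.2.3, and the 64-bit data bus (D63-D0); the variable names are illustrative only.

```python
CACHE_BYTES = 16 * 1024   # each cache holds 16 Kbytes
LINE_BYTES = 32           # a cache line is 32 bytes
WAYS = 4                  # four blocks per set (section 3.2.3)
BUS_BYTES = 8             # 64-bit external data bus, D63-D0

lines = CACHE_BYTES // LINE_BYTES            # lines per cache
sets = lines // WAYS                         # sets per cache
transfers_per_fill = LINE_BYTES // BUS_BYTES # bus transfers per line fill

print(lines, sets, transfers_per_fill)       # prints: 512 128 4
```

The last value matches the four-transfer burst used for a line fill.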

The caches are fully transparent to applications
software. Snooping (address monitoring) is designed
into both instruction and data caches, to maintain
cache consistency in multiprocessor systems.

Each cache has two sets of tags: virtual tags used
for internal access, and physical tags used for
snooping. Figure 3.3 shows how the bits of both
virtual and physical addresses are mapped for
caching. The presence of both virtual and physical tags
supports aliasing, a situation in which the TLBs
associate a single physical address with two or more
virtual addresses.


Any area of memory can be cached, although both
software and hardware can disallow certain areas
from being cached: software by setting the CD bit in
their page table entries; hardware by deasserting the
KEN# signal for bus cycles with addresses that fall
in those areas. (Data reads from the two four-Kbyte
pages pointed to by the CCUBASE field of ccr are
not cached, and the CACHE# signal is inactive, if
the DCCU is activated by setting CO of the ccr
register. This is independent of the value of KEN#.)
When both software and hardware agree that a
requested datum is cacheable, the i860 XP
microprocessor fetches an entire 32-byte line and
places it into the appropriate cache. Cache line fills
are generated only for read misses, not for write
misses. A store that misses the cache does not copy
the missed line into cache from memory, but rather
posts the datum in a write buffer, then sends it to the
external bus when the bus is available.
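
The decision made on a cache miss can be summarized as a small function. This is a simplified sketch: the function name and the returned strings are illustrative, and the special case of the CCU pages described in the parenthetical remark is deliberately left out.

```python
def miss_action(is_write, cd_bit, ken_asserted):
    """Return what a cache miss does, following the rules in the paragraph above."""
    if is_write:
        # Write misses never allocate a line; the datum goes to the write buffer.
        return "post to write buffer"
    if cd_bit == 0 and ken_asserted:
        # Software (CD = 0) and hardware (KEN# asserted) agree: fill a whole line.
        return "fill 32-byte line"
    return "noncacheable read"
```

Either party can veto caching on its own: a set CD bit or a deasserted KEN# is enough to make the read noncacheable.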



Figure 3.3. Cache Address Usage

3.2.1 DATA CACHE

Figure 3.4 shows the organization of the data cache. 
The data cache has two status bits per physical tag 
and one validity status bit for the virtual tag. A virtual 
tag hit is possible only when the validity bit of the 
virtual tag is set and the state of the physical tag is 
M, E, or S. 

Aliasing support is built into the cache look-up
algorithm. Even though a physical line may be aliased,
the processor never enters the line twice in the data
cache. If a virtual address is not found among the
virtual tags in the data cache, a bus cycle is initiated
(except that a read is not issued at this time if the bus
pipeline is full) and, at the same time, the physical
tags are searched for the physical address (which by
this time has been retrieved from the paging unit).
For reads, if the physical address is found, the data
returned from the bus is ignored, on-chip data is
used, and the virtual tag is replaced with the new
one. For writes, if a virtual address is not found, the
write is issued on the bus and memory is updated. If
the physical address is found, the line in cache is
updated, and the virtual tag is replaced with the new
one. However, the cache state (M, E, or S) of the
physical-address tag does not change when the
virtual tag is overwritten.
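
The look-up order described above (virtual tags first, then physical tags, with a retag on an alias hit) can be sketched as follows. The `Line` record and the `lookup` function are hypothetical names, the cache is flattened to a plain list of lines, and the concurrent bus cycle is not modeled.

```python
from dataclasses import dataclass

VALID_STATES = ("M", "E", "S")


@dataclass
class Line:
    vtag: int
    ptag: int
    state: str            # "M", "E", "S" or "I"
    v_valid: bool = True  # validity bit of the virtual tag


def lookup(lines, vtag, ptag):
    """Return (line, kind); kind is 'virtual hit', 'alias hit' or 'miss'."""
    for line in lines:
        if line.v_valid and line.vtag == vtag and line.state in VALID_STATES:
            return line, "virtual hit"
    for line in lines:
        if line.ptag == ptag and line.state in VALID_STATES:
            # Same physical line under a different virtual address: replace
            # the virtual tag instead of entering the line a second time.
            # The M/E/S state of the physical tag is left unchanged.
            line.vtag, line.v_valid = vtag, True
            return line, "alias hit"
    return None, "miss"
```

After an alias hit the cache still holds a single copy of the line, now tagged with the most recently used virtual address.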

Note that the BE (big endian) bit of epsr has no
influence on data cache behavior. Data items are
kept in cache in exactly the same ordering as in
external memory. Byte-shifting operations invoked by
the BE bit upon loads and stores occur at the input
to the register files only.

3.2.1.1 Data Cache Update Policies

To minimize bus traffic, a write-back policy is
normally used. The write-back policy (also called
copy-back and deferred-write) reduces bus traffic by
eliminating many unnecessary writes. Writes to a line
in the cache are not immediately forwarded to main
memory; instead, they are accumulated in the cache.
The modified cache line is written to main memory only
when its cache space is needed for other data,
when the modified data is needed by another
processor, or when a flush procedure is executed.

Under the write-back policy, a write that hits the
cache utilizes it for two cycles (one to check the
virtual tags for hit, another to update the cache line).
However, the cache pipeline allows successive
store hits to operate at one per cycle. The
processor's internal write buffers can hold two successive
stores, preventing a freeze upon store miss.

Under a write-through policy, a write request to a line
in the cache triggers updates to both cache and
main memory. An address decoder, for example,
can select the write-through policy for writes to video
RAM, where it is necessary that writes be seen on
the video display. Software, by setting the WT
page-table bit, can select the write-through policy for
specific areas of memory, such as those that are used
for interprocessor message queues.

A write-once policy combines write-through with 
write-back. Write-through is employed for the first 
write to a cache line, while subsequent writes to the 
same line follow the write-back policy. Write-once is 
valuable in multiprocessor systems to maintain 
cache consistency with the least possible bus traffic. 
The first write broadcasts to other processor nodes 
the fact that a line has been modified. Write-once is
also used if a second-level cache is attached to the 
i860 XP microprocessor to maintain consistency be- 
tween the first- and second-level caches. 

The external system can dynamically change the up- 
date policy (write-back, write-through, write-once) of 
the i860 XP microprocessor with each cache line. 


NOTES:

M Modified
E Exclusive
S Shared
I Invalid
V Validity

Figure 3.4. Data Cache Organization

3.2.2 INSTRUCTION CACHE

Figure 3.5 shows the organization of the instruction 
cache. The instruction cache has one validity bit that 
is common to both virtual and physical tags. Aliasing 
support for instructions consists not simply of chang- 
ing the virtual tag, but rather fetching a line whenev- 
er a virtual tag miss occurs. If the physical address 
already exists in the instruction cache, its line and its
tags are overwritten. So, even though a physical line 
may be aliased, the processor never enters the line 
twice in the instruction cache. 


3.2.3 CACHE REPLACEMENT ALGORITHM

The data, instruction, and address-translation
caches all use similar algorithms to choose which of
the four cache blocks will be overwritten when a
miss causes a line fetch.

First, the first invalid line (if any) in a set of four is
replaced (in the order 0, 1, 2, 3). When there are no
more invalid lines in a set, a pseudorandom
replacement algorithm chooses which valid lines to replace.
The algorithm is controlled by counters inside the
chip. RESET initializes these counters to zero, so
that the “randomness” is deterministic and two
i860 XP CPUs executing the same code on identical
boards have exactly the same series of cache hits,
misses, and replacements.

Setting ITI to invalidate the caches and TLBs also
resets the counters used to select the set used for
cache line replacement. This brings the i860 XP
microprocessor cache-replacement mechanism to a
known state without resetting the whole chip.

When the flush instruction is used to write back
modified lines in the data cache, the flush routine
must alter the RC (replacement control) field of
dirbase. Therefore, replacement is not random.
Instead, the block (or “way”) replaced is the one
selected by the RB (replacement block) field of
dirbase.


3.2.4 CACHE CONSISTENCY PROTOCOL

The i860 XP microprocessor implements cache
consistency via its use of a MESI (Modified,
Exclusive, Shared, Invalid) protocol.


3.2.4.1 Data Cache States

Each line of the data cache of the i860 XP
microprocessor can be in one of the states defined in
Table 3.1. Note that the instruction cache of the
i860 XP only implements the “SI” part of the MESI
protocol, because the instruction cache is not
writable.


NOTE:

V Validity

Figure 3.5. Instruction Cache Organization


Table 3.1. MESI Cache Line States

Cache Line State:              M              E              S              I
                               Modified       Exclusive      Shared         Invalid
This cache line is valid?      Yes            Yes            Yes            No
The memory copy is ...         out of date    valid          valid          —
Copies exist in other caches?  No             No             Maybe          Maybe
A write to this line ...       does not go    does not go    goes to bus    goes directly
                               to bus         to bus         and updates    to bus
                                                             the cache


Table 3.2. Internally Initiated Cache State Transitions

State   Next State after Read         Next State after Write*
I       If WB/WT# = 1, E; else S      I
        (line fill)                   (write-through)
S       S                             If WB/WT# = 1, E; else S
                                      (write-through)
E       E                             M
M       M                             M

NOTE:

* “Write” does not include write-backs due to replacement. Those can only cause an M to I
transition.


The state of a cache line can change as the result of 
either internal or external activity related to that line. 
Table 3.2 presents the line state transitions that re- 
sult from internal activity of the i860 XP microproces- 
sor in the data cache. 
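
Table 3.2 can also be read as a transition function. The sketch below is one possible encoding, not a description of the hardware: the function name and the bus-action strings are illustrative, and a read in state I is assumed to be a cacheable miss that results in a line fill.

```python
def next_state_internal(state, op, wb_wt_high):
    """Return (next_state, bus_action) for a processor read or write.

    state: "M", "E", "S" or "I"; op: "read" or "write";
    wb_wt_high: True if WB/WT# is sampled HIGH during the bus cycle.
    """
    if op == "read":
        if state == "I":
            return ("E" if wb_wt_high else "S"), "line fill"
        return state, None                 # read hits leave the state unchanged
    if state == "I":
        return "I", "write-through"        # a write miss does not allocate a line
    if state == "S":
        return ("E" if wb_wt_high else "S"), "write-through"
    return "M", None                       # E -> M and M -> M, no bus cycle
```

Only lines in state I or S generate bus traffic on a write; writes to E or M lines stay inside the cache.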

External cache-consistency support is provided
through inquiry cycles. Inquiry cycles are initiated by
other processors in a multiprocessor system to
check whether an address is cached in the internal
cache of the i860 XP microprocessor. Table 3.3
shows the line state transitions initiated by inquiry
cycles.


Table 3.3. Inquiry-Initiated Cache State Transitions

State   INV = 0                   INV = 1
I       I                         I
S       S                         I
E       S                         I
M       S; write back the line    I; write back the line
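
The inquiry transitions of Table 3.3 reduce to two rules: a modified line is always written back, and the line ends in I when INV = 1 (or when it was already invalid) and in S otherwise. A minimal sketch, with a hypothetical function name:

```python
def snoop(state, inv):
    """Return (next_state, write_back) for an inquiry cycle addressing a line.

    state: "M", "E", "S" or "I"; inv: value of the INV pin (0 or 1).
    """
    write_back = state == "M"      # only a modified line must be written back
    if state == "I" or inv:
        return "I", write_back
    return "S", write_back
```

For example, `snoop("E", 0)` gives `("S", False)`, while `snoop("M", 1)` gives `("I", True)`.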


3.2.4.2 Write-Once Policy 

A write-once cache policy can be implemented
through use of the WB/WT# input pin. The signal
on this pin is sampled in both read and write cycles.
A read miss causes a line to enter either S or E after
the line fill. If WB/WT# is sampled LOW at the time
of NA# or the first BRDY# activation, the line
enters S state, forcing the next write hit to this line to
show up on the bus. If WB/WT# is sampled HIGH,
the line enters E state. In write-through cycles, the
state of a line is changed from S to E when
WB/WT# is sampled HIGH, so that subsequent writes
will not be written through to the bus. Thus, if this
signal is driven LOW on read cycles and HIGH on
write cycles, a write-once cache policy is
implemented. The easiest way to implement write-once (in
systems not using the 82495XP cache controller) is to
tie this pin to the W/R# output of the processor.
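
The effect of tying WB/WT# to W/R# can be traced on a single cache line. The simulation below is a sketch: `wb_wt_pin` and `run` are hypothetical names, only one line is tracked, and inquiry cycles and replacements are ignored.

```python
def wb_wt_pin(w_r):
    """Write-once wiring: WB/WT# simply follows W/R# (1 on writes, 0 on reads)."""
    return w_r


def run(ops):
    """Apply 'read'/'write' operations to one line; return (final_state, bus_cycles)."""
    state, bus = "I", []
    for op in ops:
        if op == "read":
            if state == "I":
                bus.append("line fill")
                state = "E" if wb_wt_pin(0) else "S"   # LOW on reads -> S
        elif state in ("I", "S"):
            bus.append("write-through")
            if state == "S":
                state = "E" if wb_wt_pin(1) else "S"   # HIGH on writes -> E
        else:
            state = "M"    # E or M: the write stays in the cache
    return state, bus


print(run(["read", "write", "write", "write"]))
# prints: ('M', ['line fill', 'write-through'])
```

Only the first write after the fill appears on the bus; the later writes are absorbed by the cache, which is the write-once behavior.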


If the WT bit in the page table entry is set, the
i860 XP microprocessor ignores the WB/WT#
signal for the cycles that hit that page and always
performs a write-through. In other words, hardware
cannot override software's selection of the
write-through policy.

3.2.4.3 Locked Access 

Locked accesses are those data loads and stores 
that occur after a lock instruction up to and including 
the first load or store after the corresponding unlock 
instruction. 

State transitions for locked accesses differ from 
those in Table 3.2 in ways that guarantee that 
locked accesses are seen by all processors in the 
system. Any locked load or store generates both a 
cache look-up and an external bus cycle, regardless 
of cache hit or miss. 

1. In a locked read: 

a. If the required data is not found in the cache, 
the data from the bus is used. The data is 
placed in the cache if it is cacheable and 
KEN# is also asserted.

b. If the required data is found in an unmodified 
(E or S) state, the data from the bus is used. 

c. If the data is found in the cache in a modified
(M) state, the cached data is used, and the 
bus data is ignored, as long as no inquiry 
write-back occurs before the BRDY# of the 
bus cycle. If, however, an intervening inquiry 
write-back changes the line to S or I state, the 
bus data is used. 

2. A locked store is forced through the cache and
issued on the bus. No more data accesses occur
until the last BRDY# for the store. If the store
hits the internal cache, the cache update is done
after the last BRDY# from the bus. Note that the
line written by a locked store remains in M state
in spite of the write-through to the bus, because
the length of the write-through is less than the
line size of 32 bytes.

Locked accesses are totally serializing in the sense
that:

1. All loads and stores that precede the lock 
instruction are issued on the bus (if they miss the 
cache) before the first locked access is issued. 
The locked access can be issued before the last 
BRDY# of the prior cycle if NA# is activated in 
response to the prior cycle. 

2. No load or store after the last locked access is 
issued internally or on the bus until the final 
BRDY# for all locked accesses. 

To maximize performance, instruction fetches during 
the locked sequence are not serializing. When NA# 
invokes pipelining, instruction fetches may be issued 
while locked data fetches or stores remain on the 
bus. 


3.3 Internal Cache Consistency 

Both the instruction and the data caches can be 
snooped by externally generated inquiry cycles, and 
the result of the look-up is presented on the HIT# 
and HITM# output pins. These inquiry cycles help 
maintain consistency with caches of other proces- 
sors. However, software must take care not to cre- 
ate inconsistencies such as the following among the 
internal caches (including the TLBs): 

1. Changing the address space while leaving
virtual-address tags from the prior space in the
instruction or data cache.

2. Changing instructions in memory (or in the data 
cache) without changing them in the instruction 
cache. 

3. Changing page table information in memory (or in 
the data cache) without changing the same infor- 
mation in the TLBs. 

Under certain circumstances, such as I/O refer- 
ences, self-modifying code, page-table updates, or 
shared data in a multiprocessing system, it is neces- 
sary to bypass, to invalidate, or to flush the caches. 
The i860 XP microprocessor provides the following 
methods for doing this: 

• Bypassing Instruction and Data Caches. 

1. If deasserted during cache-miss processing, 
the KEN# pin disables instruction and data 
caching of the referenced data. 

2. If the CD bit of the associated page table is 
set, caching of a page is disabled. The value of 
the CD bit is output on the PCD pin for use by 
external caches. 


3. If the WT bit of the associated page table is
set, caching is not disabled, but writes pass
through the cache. The value of the WT bit is
output on the PWT pin for use by external
caches. (Note that WT does not affect policy
for the instruction cache, because the
instruction cache is not writable. However, when an
instruction from a page having the WT bit of
the PTE set is placed in the data cache, the
write-through policy applies just as for a data
page.)


• Invalidating Cache Entries. Storing to the
dirbase register with the ITI bit set invalidates
each line of the instruction and address-translation
caches. In the data cache, it invalidates the
virtual tags, but not the physical tags.

• Flushing the Data Cache. The data cache is
flushed by a software routine that uses the flush
instruction. The flush instruction speeds up
write-backs. The same effect (writing back modified
lines) can be achieved with the load instruction
ld.l, but this would be more than twice as slow:
the load must first do four bus transfers to get
new data, then write back the modified line. The
flush instruction causes the write-backs without
requiring a read from external memory to replace
the modified line.



3.3.1 ADDRESS SPACE CONSISTENCY 

In a multitasking virtual-address system, the operat- 
ing system may intentionally employ aliasing, where 
several processes use the same physical memory 
while accessing it with different virtual addresses. 
When the operating system switches control from 
one process to the next, it changes the DTB field of 
the dirbase to point to a different page directory that 
defines the new address space. When this happens, 
all caches must be invalidated: the TLBs, so that the
new page directory is read into the TLBs; the data
and instruction caches, so that virtual addresses
from the new space don't accidentally match cached
virtual addresses from the old space.

The caches are invalidated by setting the ITI bit
when writing to dirbase. Invalidating the instruction
cache invalidates both the physical and the virtual
tags, because the instruction cache has one status
(valid) bit, which is common to both physical and
virtual tags. In the data cache, setting ITI does not
invalidate physical tags. However, any modified lines
will eventually be written back when their space is
required for lines from the new address space or
when external agents on the bus express a need for
the modified data via inquiry cycles.

The caches are invalidated by setting the ITI bit
when writing to dirbase. Note, however, that the
operating system code that flushes the caches must
be present during the flushing. Typically this code
has the same virtual address for all processes.

NOTE: 

The mapping of the page(s) containing the
currently executing instruction, the next six
instructions, and any data referenced by these
instructions should not be different in the new
page tables when the DTB is changed.

Enabling or disabling address translation (via the
ATE bit) is similar to changing the DTB, in that the
address mapping is changed. The virtual tags in the
data and instruction cache must be invalidated prior
to changing ATE.

3.3.2 INSTRUCTION CACHE CONSISTENCY 

When software modifies a page containing
instructions (as when a debugger replaces an instruction
with the trap instruction to set a breakpoint), the
instruction cache can become inconsistent for any of
the following reasons:

• Because the data cache uses a write-back policy,
changes to cached instruction pages do not
immediately update memory.

• Changes to instructions do not automatically
update the instruction cache.

• Instruction cache misses are not checked in the 
data cache. 

Software must ensure that modified lines containing 
instructions are written to main memory before the 
instruction cache tries to read them. There are two 
methods for this: 

1. Flush the data cache using the flush instruction.
Note that to make the instruction cache consist- 
ent with the data cache, the data cache must be 
flushed before invalidating the instruction cache. 

2. Mark all instruction pages as WT (write through) 
so that modifications to instructions are immedi- 
ately written to memory. This is the better alterna- 
tive. 

In either case, the instruction cache must be invali- 
dated (by a store to dirbase with ITI set) after a 
code page has been modified, so that the updated 
instructions will be read from memory. 


3.3.3 PAGE TABLE CONSISTENCY 

When the operating system modifies page tables or 
directories, the TLBs can become inconsistent with 
the modifications for any of the following reasons: 

• Because the data cache uses a write-back policy, 
updates to cached page tables do not immediate- 
ly update memory. 

• Changes to page tables do not automatically up- 
date the TLB. 

• The i860 XP microprocessor searches only
external memory for page directories and page tables
in the translation process. The data cache is not
searched. (Data is not transferred from the data
cache to the TLBs during TLB replacement
cycles.)

Software must ensure that modified lines containing
page table entries are written to main memory
before the paging unit tries to read them. There are two
methods for this:

1. Keep page tables and directories in noncacheable
memory or write-through pages.

2. Flush the data cache using the flush instruction. 

The processor itself invalidates the affected TLB en- 
try, when a trap is triggered by the need to set the A 
or D bit. In other cases, after a page table or directo- 
ry has been modified, software must invalidate the 
TLBs (by a store to dirbase with ITI set) so that the 
updated entries will be read from memory. 

The data cache does not need flushing if the
program is modifying only the P, U, W, A, or D bits of a
PTE (as long as the page frame address is not
changed and the PTE itself is not in the data cache).
The i860 XP CPU does not use the TLB for cache
line write-backs; it writes to the address in the
physical tag.

Thus, a trap handler can service a data access trap
for D-bit zero merely by setting D = 1. When setting
the P or A bits, there is no need to invalidate or flush
any caches, because the processor does not load
entries into the TLB that have P = 0 or A = 0.

Two potential TLB inconsistencies are avoided auto- 
matically by the i860 XP microprocessor. 

1. If the paging unit issues a write cycle (to set the A
bit, for example), this cycle is snooped by the
data cache for invalidation.

2. Any TLB entry that causes a DAT or IAT is
automatically invalidated.

3.3.4 CONSISTENCY OF CACHEABILITY

Normally, an operating system ensures that the
page attributes (CD and WT) of a memory access
are consistent with the cache contents. However,
the operating system can fail to maintain
consistency by the following actions:

• Changing the CD or WT bits while related lines
are in the cache.

• Aliasing a physical address with virtual addresses
that have differing CD or WT bits.

In these situations, the i860 XP microprocessor
gives priority to cache state. For example:

1. If a read or write request is to a noncacheable
page (CD = 1), but the data (or code) is found in
cache, the request is satisfied by the cache, and
no external cycle is issued.

2. If the physical address of a read or write request
hits in the cache but the virtual address misses,
the virtual tag is overwritten by the new virtual
address, but the CD bit of the new virtual address
is ignored.

3. If a store to a write-through page (WT = 1) hits a
cache line in E or M state, no write-through cycle
is issued; only the cache is updated.

3.3.5 LOAD PIPE CONSISTENCY

The pfld (pipelined floating-point load) instruction
facilitates transfer of data from memory to registers,
and avoids placing data in the data cache. When
large amounts of data are used, pfld allows the
programmer to keep rarely-used data out of the cache.
The i860 XP microprocessor ensures consistency
between cached data and pfld references. It checks
the data cache and, upon a data cache hit to a
modified line, forwards data from cache into the
three-stage pfld pipeline.

3.3.6 SUMMARY

Table 3.4 summarizes flush and invalidation
requirements, assuming that WT is set in the PTEs of
instruction and page-table pages:

Table 3.4. Summary of Cache Flushing And Invalidation

Action                           Flush Data Cache   Invalidate Caches (ITI)
Setting A                        No                 No
Setting P                        No                 No
Clearing P                       No                 Yes
Setting D                        No                 No
Changing protection (U, W)       No                 Yes
Setting CD or WT                 Yes                Yes
Changing PFA in a used(1) PTE    No                 Yes
Changing dirbase DTB             No                 Yes
Changing dirbase ATE             No                 Yes
Changing epsr WP                 No                 No
Setting ccr DO and CO            Yes(2)             Yes(2)
Modifying code                   No(3)              Yes


NOTES:

1. “Used” means a PTE that at some past time had P set.

2. If data from either of the CCU pages could have been
cached.

3. Assuming all instructions and their page directories and
page tables are in write-through or noncacheable pages.
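
An operating system could encode Table 3.4 as a lookup so that each kind of change yields the required maintenance steps. The dictionary keys, the function name, and the step strings below are illustrative assumptions; the conditions in notes 2 and 3 are not modeled, and the flush is listed first because section 3.3.2 requires the data cache to be flushed before the instruction cache is invalidated.

```python
# action -> (flush data cache?, invalidate caches with ITI?), per Table 3.4
REQUIREMENTS = {
    "setting A": (False, False),
    "setting P": (False, False),
    "clearing P": (False, True),
    "setting D": (False, False),
    "changing protection": (False, True),
    "setting CD or WT": (True, True),
    "changing PFA in a used PTE": (False, True),
    "changing dirbase DTB": (False, True),
    "changing dirbase ATE": (False, True),
    "changing epsr WP": (False, False),
    "setting ccr DO and CO": (True, True),   # see note 2 of Table 3.4
    "modifying code": (False, True),         # see note 3 of Table 3.4
}


def required_steps(action):
    """Return the ordered maintenance steps for one action from Table 3.4."""
    flush, invalidate = REQUIREMENTS[action]
    steps = []
    if flush:
        steps.append("flush data cache")
    if invalidate:
        steps.append("store to dirbase with ITI set")
    return steps
```

For example, `required_steps("setting CD or WT")` lists both steps, while `required_steps("setting D")` returns an empty list.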


4.0 HARDWARE INTERFACE 

In the following description of hardware interface, 
the # symbol at the end of a signal name indicates 
that the active or asserted state occurs when the 
signal is at a low voltage. When no # is present after 
the signal name, the signal is asserted when at the 
high voltage level. 


4.1 Pins Overview 

Figure 4.1 identifies functional groupings of the pins.
Table 4.1 lists every pin by its identifier, gives a brief
description of its function, and lists some of its
characteristics. All output pins are tristate, except BREQ,
HIT#, HITM#, HLDA, LOCK#, and PCHK#.

Table 4.1. Pin Summary

Pin ID      Name                          Active  When Floated             Internal
                                          Level   Synch/Asynch             Resistor

Output Pins
ADS#        Address Status                LOW     HLDA, clock after BOFF#
BE7#-BE0#   Byte Enable                   LOW     HLDA, BOFF#
BREQ        Bus Request                   HIGH
CACHE#      Cache                         LOW     HLDA, BOFF#
CTYP        Cycle Type                    HIGH    HLDA, BOFF#
D/C#        Data/Code                             HLDA, BOFF#
HIT#        Snoop Hit Cache               LOW
HITM#       Snoop Hit Modified Line       LOW
HLDA        Hold Acknowledge              HIGH
KB0, KB1    Cache Block                   HIGH    HLDA, BOFF#
LEN         Length                        HIGH    HLDA, BOFF#
LOCK#       Address Lock                  LOW
M/IO#       Memory/IO                             HLDA, BOFF#
NENE#       Next Near                     LOW     HLDA, BOFF#
PCD         Page Cache Disable            HIGH    HLDA, BOFF#
PCHK#       Parity Check                  LOW
PCYC        Page Cycle                    HIGH    HLDA, BOFF#
PWT         Page Write-Through            HIGH    HLDA, BOFF#
TDO         Test Output                           Nonscan Mode
W/R#        Write/Read                            HLDA, BOFF#

Input/Output Pins
A31-A3      Address                       HIGH    AHOLD, HLDA, BOFF#
D63-D0      Data                          HIGH    HLDA, BOFF#
DP7-DP0     Data Parity                   HIGH    HLDA, BOFF#

Input Pins
AHOLD       Address Hold                  HIGH    Synch
BERR        Bus Error                     HIGH    Synch
BOFF#       Back-Off                      LOW     Synch
RSRVD#      Intel Reserved
BRDY#       Burst Ready                   LOW     Synch
BYPASS#     Intel Reserved                LOW
CLK         Clock
RESET       Reset                         HIGH    Asynch
EADS#       External Address Status       LOW     Synch
EWBE#       External Write Buffer Empty   LOW     Synch
FLINE#      Flush Line                    LOW     Synch
HOLD        Bus Hold                      HIGH    Synch
INT/CS8     Interrupt/Code-Size 8         HIGH    Asynch
INV         Invalidate                    HIGH    Synch
KEN#        Cache Enable                  LOW     Synch
NA#         Next Address                  LOW     Synch
PEN#        Parity Enable                 LOW     Synch
TCK         Test Clock
TDI         Test Data Input                       Synch                    Pull-up
TMS         Test Mode Select                      Synch                    Pull-up
TRST#       Test Reset                    LOW     Asynch                   Pull-up
WB/WT#      Write-Back/Write-Through              Synch
SPARE       Intel Reserved

The pins D/C#, W/R#, and M/IO# define bus
cycle types. They are summarized in Table 4.2. For
data transfers to or from memory, two additional
pins, CTYP and PCYC, provide further information
regarding the type of transfer, as shown in Table 4.3.
Table 4.4 shows how the LEN and CACHE# pins
determine cycle length.
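
The cycle-definition encodings of Tables 4.2 and 4.3 amount to two lookups, which a bus monitor or simulator might implement as below. This is a sketch: the dictionary and function names are illustrative, pin values are given as 0 (low voltage) or 1 (high voltage), and PCYC and CTYP are consulted only for memory data cycles, where they are defined.

```python
# (M/IO#, D/C#, W/R#) -> bus cycle, per Table 4.2
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Special Cycle",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Code Read",
    (1, 0, 1): "Reserved",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}

# (PCYC, CTYP, W/R#) -> memory data transfer type, per Table 4.3
TRANSFER_TYPES = {
    (0, 0, 0): "Normal read",
    (0, 1, 0): "Pipelined load (pfld instruction)",
    (1, 0, 0): "Page directory read",
    (1, 1, 0): "Page table read",
    (0, 0, 1): "Write-through (S-state hit)",
    (0, 1, 1): "Store miss or write-back",
    (1, 0, 1): "Page directory update",
    (1, 1, 1): "Page table update",
}


def decode(m_io, d_c, w_r, pcyc=0, ctyp=0):
    """Return (bus cycle, transfer type or None) for one ADS#-initiated cycle."""
    cycle = BUS_CYCLES[(m_io, d_c, w_r)]
    if m_io == 1 and d_c == 1:
        # PCYC and CTYP are defined only for memory data transfer cycles.
        return cycle, TRANSFER_TYPES[(pcyc, ctyp, w_r)]
    return cycle, None
```

For example, `decode(1, 1, 0, pcyc=1, ctyp=1)` identifies a memory read issued to fetch a page table entry.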

Table 4.2. ADS# Initiated Bus Cycle Definitions

M/IO#   D/C#   W/R#   Bus Cycle Initiated
0       0      0      Interrupt Acknowledge
0       0      1      Special Cycle
0       1      0      I/O Read
0       1      1      I/O Write
1       0      0      Code Read
1       0      1      Reserved
1       1      0      Memory Read
1       1      1      Memory Write


Table 4.3. Memory Data Transfer Cycle Types

PCYC   CTYP   W/R#   Data Transfer Type
0      0      0      Normal read
0      1      0      Pipelined load (pfld instruction)
1      0      0      Page directory read
1      1      0      Page table read
0      0      1      Write-through (S-state hit)
0      1      1      Store miss or write-back
1      0      1      Page directory update
1      1      1      Page table update

NOTE:

PCYC and CTYP are defined only for memory data transfer
cycles (D/C# = 1, M/IO# = 1)


Table 4.4. Cycle Length Definition

W/R#   LEN   CACHE#   KEN#   Cycle Description                      Burst Length
0      0     1        —      Noncacheable** 64-bit (or less) read   1
0      0     —        1      Noncacheable 64-bit (or less) read     1
1      0     1        —      64-bit (or less) write                 1
—      0     1        —      I/O and Special Cycles                 1
0      1     1        —      Noncacheable 128-bit read (p)fld.q     2
0      1     —        1      Noncacheable 128-bit read (p)fld.q     2
1      1     1        —      128-bit write fst.q                    2
0      —     0        0      Cache line fill                        4
1      —     0        —      Cache write-back                       4

NOTE:

** Includes CS8-mode code fetches, which may be cached by the processor.
— Indicates “don't care” values.


Figure 4.1. Signal Grouping (pin groups: Cycle Control, Cache Control,
Cache Consistency, Bus Arbitration, Cycle Definition, Boundary Scan)


4.2 Signal Description

In this section descriptions of all pins are presented 
in alphabetical order. 


4.2.1 A31-A3 (ADDRESS PINS)

The 29-bit address bus (A31-A3) identifies addresses
to a 64-bit location. Separate byte-enable signals
(BE7#-BE0#) identify which bytes should be
accessed within the 64-bit location.

The address lines are bidirectional. The i860 XP
microprocessor drives the address lines unless it is in a
hold state. The system drives address lines A31-A5
to perform cache line inquiries (refer to the EADS#
signal description).

4.2.2 ADS# (ADDRESS STATUS) 

The i860 XP microprocessor asserts ADS# to
identify the first clock period of each bus cycle, the clock
period during which new values become valid on the
address bus and cycle-definition pins. This signal is
held active for one clock.

If BOFF# is asserted, the processor floats ADS#
two clocks after sampling BOFF# (and not, like all
other pins, on the next clock). This is to ensure that
ADS# is deasserted before it floats, and therefore is
never left floating active.

ADS# can be asserted while AHOLD is active to
initiate a cache write-back cycle.

4.2.3 AHOLD (ADDRESS HOLD) 

The external system asserts AHOLD to perform a 
cache inquiry. In response to assertion of AHOLD, 
the i860 XP microprocessor immediately (in the next
clock) stops driving the address bus (A31-A3 lines).
The other buses remain active, and data can be
transferred for previously issued read or write bus
cycles during address hold. AHOLD is recognized
even during RESET and LOCK#. The earliest that
AHOLD can be deasserted is the clock after EADS#
is asserted to start the inquiry.

If HITM# has activated due to an inquiry, the 
i860 XP microprocessor asserts ADS# while 
AHOLD is active to start the write-back of the modi- 
fied line that was the target of the inquiry. 


4.2.4 BE7#-BE0# (BYTE ENABLES) 

The byte-enable pins are driven with the address. BE7# applies
to D63-D56; BE0# applies to D7-D0.

In write cycles (noncacheable writes as well as cache line
write-backs), the BEn# signals determine which bytes must be
written into external memory for the current cycle.

In read cycles, the BEn# values indicate which bytes the load
instruction has requested. In all noncacheable read cycles
(CACHE# or KEN# deasserted), the byte enables match the length
and address of the requested data. Cacheable read cycles (KEN#
asserted), however, result in four 64-bit memory transfers to
fill an entire 32-byte cache line. The BEn# pins activated are
those that represent the operand of the load instruction that
caused the line fill, and these same BEn# pins remain activated
for as long as A31-A5. All 64 bits must be returned for each
cacheable cycle without regard for the BEn# signals.

While in CS8 mode, BE2#-BE0# serve as (active- 
high) lower-order address bits for instruction fetches 
(from the ROM). Data fetches and stores are not 
affected by CS8 mode, and BE2#-BE0# retain 
their normal byte-enable function for data. 
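
As an illustration of how the address and byte-enable pins divide the work, the sketch below derives the A31-A3 value and the BE7#-BE0# pin levels for a noncacheable access. It is only a sketch: the function name is hypothetical, and it assumes that byte lane n (BEn#) corresponds to byte offset n within the 64-bit location, which this section does not state explicitly.

```python
def byte_enables(addr, size):
    """Return (A31-A3 value, BE7#-BE0# pin levels) for one access.

    Assumption: byte lane n (BEn#, data pins D(8n+7)-D(8n)) holds the
    byte at offset n within the 64-bit location.
    """
    offset = addr & 0x7                      # byte offset inside the 64-bit location
    if size not in (1, 2, 4, 8) or offset + size > 8:
        raise ValueError("access must fit inside one 64-bit location")
    selected = ((1 << size) - 1) << offset   # bit n set means lane n is selected
    pins = ~selected & 0xFF                  # BEn# are active low: 0 = asserted
    return addr >> 3, pins


# A 4-byte access at offset 4 asserts BE7#-BE4# (driven low).
a31_a3, be = byte_enables(0x1004, 4)
print(hex(a31_a3), format(be, "08b"))
```
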


4.2.5 BERR (BUS ERROR) 

This is a nonmaskable interrupt input, which supports bus error
handling or other urgent circumstances. BERR is not masked by
the IM bit of the psr nor by lock cycles. When BERR is
activated, the i860 XP microprocessor vectors to the trap
handler and sets the bus error flag (BEF) in the epsr. BERR
causes the physical address of the current bus cycle to be
latched into the BEAR control register; thus, if asserted in
the clock of BRDY# or the clock after BRDY#, it causes the bus
address to be latched for software to examine. BERR is
rising-edge sensitive. Once the trap has occurred, further BEF
traps cannot occur until software has cleared BEF and read
BEAR.

BERR does not terminate outstanding bus cycles. Therefore, the
system must still activate BRDY# a sufficient number of times
or activate BOFF# for those cycles. Even though activating
BOFF# temporarily halts the erring cycles, the i860 XP
microprocessor will retry them when BOFF# is deasserted, in
spite of BERR.

Timing of BERR is not influenced by late back-off 
mode. 


4.2.6 BOFF# (BACK-OFF) 

The system can assert this signal to abort all outstanding bus
cycles that have not yet completed. In response to BOFF#, the
i860 XP microprocessor immediately (in the next clock) floats
its bus, except for ADS#, which is floated one clock later. The
processor floats all the same pins normally floated during bus
hold; however, unlike a bus hold, HLDA is not asserted. (HLDA
is asserted only in response to HOLD; no acknowledgment is
required for BOFF#.) Any data and BRDY# returned to the
processor while BOFF# is asserted are ignored. The processor
remains in bus hold until BOFF# is deasserted, at which time it
restarts the bus cycles by driving the address and cycle
definition pins and asserting ADS#. When BOFF# deactivates,
ADS# may be asserted the following clock. Thus a BOFF# duration
of one clock results in not floating ADS# at all. BOFF# cannot
be used to force the pins to float during RESET; use HOLD for
that purpose.

4.2.7 BRDY# (BURST READY) 

The input BRDY# indicates either that the external 
system has driven valid data on the data pins in re- 
sponse to a read request or that the external system 
has latched the data in response to a write request. 
The CPU ignores this signal when no bus requests 
are outstanding. During a bus cycle, BRDY# is sam- 
pled at each clock, starting with the clock after as- 
sertion of ADS# and continuing until all data for the 
cycle has been transferred. When BRDY# is sampled active in a
read cycle, the data present on the pins is sampled.


4.2.8 BREQ (BUS REQUEST)

BREQ allows the i860 XP microprocessor to share the local bus
with other bus masters. An external bus arbiter can use BREQ to
implement an "on demand only" policy for granting the bus to
the i860 XP microprocessor. The i860 XP microprocessor asserts
BREQ the clock after it realizes an internal request for the
bus. The system should sample this pin only when the i860 XP
microprocessor is not in control of the bus (that is, when
HLDA, BOFF#, or AHOLD is active). BREQ is undefined when the
i860 XP microprocessor is driving the bus. BREQ may be
deasserted between assertions of ADS#, but this does not imply
that the CPU does not need the bus.


4.2.9 BYPASS# (BYPASS)

This pin is reserved by Intel Corporation and should be tied
HIGH to Vcc through a resistor. When LOW, the phase-locked loop
that generates the internal clock is unused. In this case, the
internal clock has more skew relative to the external CLK, and
the A.C. timing parameters are not guaranteed.


4.2.10 CACHE# (CACHEABILITY)

This output signal indicates internal cacheability of a bus
request. Its timing follows that of the address bus.

The i860 XP microprocessor asserts CACHE# for cacheable reads
and code fetches to announce its intention to cache the data.
If CACHE# is asserted on a read cycle and if the KEN# input is
active, the cycle is a burst line fill. If CACHE# is inactive
in a read cycle, the i860 XP microprocessor does not cache the
returned data, regardless of the KEN# pin. CACHE# is also
asserted for cache line write-backs.

CACHE# is inactive for noncacheable reads (for example, pfld,
ldio, ldint), TLB replacements, and store misses.

Table 4.4 shows how cacheability determines the number of data
transfers in a cycle.

Note that the CACHE# output is always inactive for CS8
(Code-Size 8 bits) mode instruction fetches so that the
instructions are fetched with single-transfer cycles. However,
the code fetched may then be placed in the instruction cache,
unless KEN# was inactive.


4.2.11 CLK (CLOCK) 

The CLK input determines execution rate and timing of the
i860 XP microprocessor. External timing parameters are
specified relative to the rising edge of this signal. The
i860 XP microprocessor can operate at a clock rate of 50 MHz.
The internal operating frequency is the same as the external
clock. This signal requires TTL levels.






4.2.12 CTYP (CYCLE TYPE) 

CTYP is one of the bus cycle definition signals. Tables 4.2 and
4.3 show the types of bus cycle generated. CTYP is defined only
for data write and read requests. The value of this pin changes
only when ADS# is asserted.


4.2.13 D/C# (DATA/CODE) 

D/C# specifies whether the current request is for data or
instructions. The data/code line is one of the bus cycle
definition pins. Tables 4.2 and 4.3 show the types of bus cycle
generated. The value of this pin changes only when ADS# is
asserted.


4.2.14 D63-D0 (DATA PINS) 

The bus interface has 64 bidirectional data pins (D63-D0) to
transfer data in eight- to 64-bit quantities. Pins D7-D0
transfer the least significant byte; pins D63-D56 transfer the
most significant byte. In read cycles, all 64 bits of the data
bus are latched, even in CS8-mode instruction fetches when only
the low-order eight bits are used. In write cycles, the i860 XP
microprocessor does not drive D63-D0 in the clock of ADS#, but
in the following clock.

4.2.15 DP7-DP0 (DATA PARITY) 

There is one parity signal for each byte of the data bus. They
are driven by the i860 XP microprocessor with even parity
information on writes with the same timing as write data.
Likewise, if parity checking is enabled by PEN#, the system
must drive even parity information on these pins with the same
timing as read information to ensure that the correct parity
check status is indicated by the i860 XP microprocessor. "Even
parity" means that the total number of set bits in a byte,
including the parity bit, is even. Refer also to the PCHK#
signal.
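
The even-parity rule can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the pairing of DPn with data byte n (D(8n+7)-D(8n)) is an assumption that this section does not spell out.

```python
def even_parity_bits(data):
    """Return DP7-DP0 as an 8-bit value for a 64-bit data word.

    Assumption: DPn covers data byte n, i.e. pins D(8n+7)-D(8n).
    """
    dp = 0
    for lane in range(8):
        byte = (data >> (8 * lane)) & 0xFF
        ones = bin(byte).count("1")
        # The parity bit is set when the byte has an odd number of ones,
        # so that byte plus parity bit together hold an even count.
        dp |= (ones & 1) << lane
    return dp


def parity_ok(data, dp):
    """True when the received DP7-DP0 value matches the data word."""
    return even_parity_bits(data) == (dp & 0xFF)
```

A memory system that stores parity would generate these bits on writes and return them unchanged on reads; a single flipped data bit then makes `parity_ok` false for that transfer.
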

4.2.16 EADS# (EXTERNAL ADDRESS STATUS) 

This signal indicates that a valid external address has been
driven onto address pins A31-A5 of the i860 XP microprocessor
to be used for a cache inquiry. This signal is recognized while
the processor is in hold (HLDA is driven active), while it is
forced off the bus by the BOFF# input, or while AHOLD is
asserted. The i860 XP microprocessor ignores EADS# at all other
times. EADS# is not recognized if HITM# is active, nor during
the clock after ADS#, nor during the clock after a valid
assertion of EADS#. Table 4.5 shows when EADS# is first
sampled. It is then sampled in every clock as long as the hold
remains active and HITM# remains inactive.


Table 4.5. EADS# Sample Time

Trigger    EADS# First Sampled
-------    ---------------------------------
AHOLD      Second clock after AHOLD asserted
HOLD       First clock after HLDA asserted
BOFF#      Second clock after BOFF# asserted


INV and FLINE# are sampled in the same clock pe- 
riod that EADS# is validly asserted. HIT# and 
HITM# may be asserted as the results of a cache 
inquiry. 


4.2.17 EWBE# (EXTERNAL WRITE BUFFER 
EMPTY) 

At RESET, the value on EWBE# determines the ordering mode. The
processor enters strong ordering mode if EWBE# is sampled
active for at least the last three clocks before RESET
deactivates; otherwise it enters weak ordering mode.

In weak ordering mode, the value of EWBE# after 
reset does not affect processor operation. 

In strong ordering mode, the external system asserts EWBE# as
long as all external write buffers are empty. If an external
write buffer is not empty (EWBE# deasserted) or the internal
write buffer is not empty, the processor delays data cache
updates so as to keep the external order of writes the same as
the programmed order.

In systems that do not have external write buffers, EWBE# can
be tied to Vss if strong ordering is desired, or to Vcc if weak
ordering is acceptable. Refer to sections 5.3.3 and 5.3.4 for
more explanation and for other ways to control write ordering.

4.2.18 FLINE# (FLUSH LINE) 

The system asserts FLINE# to request that the i860 XP
microprocessor write back a modified cache line before other
outstanding bus cycles are completed, if the line is hit by an
external inquiry. If this pin is active in the same clock that
EADS# is asserted, the write-back cycle is initiated, and the
i860 XP microprocessor expects BRDY#s for the write-back before
outstanding cycles (if any) are returned. If data transfer for
another cycle is currently in progress when FLINE# is asserted
(that is, the first BRDY# is returned before HITM# is
asserted), the i860 XP microprocessor waits until the data
transfers for that burst have completed, and only then does it
assert the ADS# for the write-back. If the first BRDY# has not
yet occurred for an outstanding cycle, NA# must be activated to
trigger ADS# for the write-back.

At RESET, the value on FLINE# determines configuration. The
processor enters one-clock late back-off mode if FLINE# is
sampled active for at least the last three clocks before RESET
deactivates.


4.2.19 HIT# (CACHE INQUIRY HIT) 

This pin is one output of inquiry cycles. If an inquiry cycle
hits a valid line in the caches of the i860 XP microprocessor
(either data or instruction), HIT# is asserted two clocks after
EADS# is activated. If the inquiry cycle misses the caches,
this pin is negated two clocks after EADS# activation.

This pin changes its value only as a result of EADS# activation
during AHOLD, HOLD, or BOFF# and retains its value until two
clocks after the next valid activation of EADS#.

HIT# can be used to control the WB/WT# pin of other processors
in a multiprocessor system. Activation of HIT# indicates that
the inquiring processors should cache the line as S-state, not
E-state.


4.2.20 HITM# (HIT MODIFIED LINE) 

This pin is an output of inquiry cycles. When an inquiry hits a
modified line in the internal data cache, the i860 XP
microprocessor asserts HITM# two clocks after EADS# is
activated. (Refer also to the EADS# signal.) The HITM# signal
stays active until the last BRDY# for the corresponding
write-back cycle. At all other times, HITM# is inactive. HIT#
is also asserted when HITM# is asserted (except for the special
case of an inquiry after the ADS# of a write-back).
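
The relationship between the two inquiry outputs can be summarized with a small sketch. The state letters M, E, S, and I (modified, exclusive, shared, invalid) and the function name are assumptions made for illustration; the sketch ignores timing and the special case of an inquiry after the ADS# of a write-back.

```python
def inquiry_outputs(line_state):
    """Return (HIT# asserted, HITM# asserted) for a cache inquiry.

    line_state is the assumed state of the inquired line:
    "M" (modified), "E" (exclusive), "S" (shared), or "I" (invalid/miss).
    """
    if line_state not in ("M", "E", "S", "I"):
        raise ValueError("unknown line state")
    hit = line_state != "I"    # any valid line asserts HIT#
    hitm = line_state == "M"   # only a modified line asserts HITM#
    return hit, hitm
```
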


4.2.21 HLDA (BUS HOLD ACKNOWLEDGE) 

The i860 XP microprocessor activates HLDA in response to a hold
request presented on the HOLD pin. Assertion of HLDA indicates
that the i860 XP microprocessor has given the bus to another
local bus master. It is driven active in the same clock that
the i860 XP microprocessor floats its bus. All output pins are
floated except LOCK#, BREQ, HLDA, PCHK#, HIT#, and HITM#.

The time required to acknowledge a hold request is one clock
plus the number of clocks needed to finish any outstanding bus
cycles (maximum of four outstanding cycles of four burst
transfers each for a total of 16 transfers). If this hold
latency is too long for a given application, BOFF# can be used
instead.

When leaving a bus hold, the i860 XP microprocessor deactivates
HLDA and, in the same clock period, initiates a pending bus
cycle, if any.


4.2.22 HOLD (BUS HOLD) 

This pin, along with the output signal HLDA, is used for local
bus arbitration. At some time after the HOLD signal is
asserted, the i860 XP microprocessor releases control of the
local bus, puts most bus interface outputs in floating state,
and asserts HLDA, all during the same clock period. It
maintains this state until HOLD is deasserted. Instruction
execution stops only if required instructions or data cannot be
read from the on-chip instruction and data caches. The i860 XP
microprocessor ignores HOLD until all outstanding bus cycles
are complete (until the last BRDY#). The i860 XP microprocessor
recognizes HOLD even during RESET and LOCK#. HOLD cannot be
used when the 82495XP cache controller is attached.


4.2.23 INV (INVALIDATE) 

The external system asserts this signal to invalidate the
cache line in the case of an inquiry cycle hit. It is sampled
together with A31-A5 in the clock in which EADS# is active.


4.2.24 INT/CS8 (INTERRUPT/CODE-SIZE 
EIGHT BITS) 

This input, like the BERR input, allows interruption of the
current instruction stream. The processor samples INT at
instruction boundaries. If interrupts are enabled (IM set in
psr) when INT is sampled active, the i860 XP microprocessor
fetches the next instruction from virtual address 0xFFFFFF00.
INT is level triggered. To assure that an interrupt is
recognized, INT should remain asserted until the software
acknowledges the interrupt (by executing an
interrupt-acknowledge cycle, for example). The interrupt may be
ignored by the processor if the INT signal does not remain
active.

Interrupt latency (the maximum time between assertion of INT
and execution of the first instruction of the trap handler)
depends both on the internal context and on the external
system. After INT is asserted, the i860 XP microprocessor
finishes all instructions currently being executed, including
any outstanding bus cycles, before starting the trap handler.
The following instruction sequence is an example of the worst
case:

    pfld.q
    pfld.q
    ld.l
    br
    ld.l
    st.l

If INT is asserted during the execution stage of the last ld.l
instruction, the execution of the trap handler may have to wait
for:

• Two 2-transfer bursts (the pfld instructions)

• Two data cache line fills (misses by the ld.l instructions)

• Two data cache line write-backs (eliminating modified lines
to open space for the fills)

• Two instruction cache line fills (the target of the br and
the first instruction of the trap handler)

• Three TLB miss sequences of up to six nonpipelined accesses
each (the br, the last ld.l, and the trap handler)
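
To get a feel for the size of this worst case, the bus activity in the list above can be tallied as data transfers. The sketch below is only an estimate under stated assumptions: each cache line fill or write-back is counted as four 64-bit transfers (one 32-byte line), each TLB miss access as one transfer, and wait states, inquiry cycles, and internal freeze conditions are left out.

```python
# (description, number of occurrences, transfers per occurrence)
# Transfers per line fill or write-back are an assumption: 32-byte line / 8 bytes.
WORST_CASE_ACTIVITY = [
    ("pfld.q bursts", 2, 2),
    ("data cache line fills", 2, 4),
    ("data cache line write-backs", 2, 4),
    ("instruction cache line fills", 2, 4),
    ("TLB miss sequences", 3, 6),
]


def worst_case_transfers(activity=WORST_CASE_ACTIVITY):
    """Sum the data transfers the trap handler may have to wait for."""
    return sum(count * transfers for _, count, transfers in activity)


print(worst_case_transfers())
```

Each transfer needs at least one clock, so this count is a lower bound on the bus-related part of the latency, not a timing guarantee.
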

The time to finish the above bus activities can be 
extended by inquiry cycles and associated write- 
backs initiated by an external cache or bus control- 
ler. 

Besides the bus-related delays, the i860 XP microprocessor has
internal freeze conditions that can delay interrupt response by
up to 10 additional clocks.

During a locked sequence, the INT pin is ignored, 
and the INT bit of epsr reflects the value on the INT 
pin. To limit the time that INT is ignored, the lock 
instruction can assert LOCK# for only 30-33 in- 
structions before trapping. 

This input is asynchronous, but appropriate setup and hold
times must be met to ensure recognition on any specific clock.

If INT is asserted for at least the last three clock periods
before the falling edge of RESET, the i860 XP microprocessor
enters eight-bit code-size (CS8) mode.

4.2.25 KB0, KB1 (CACHE BLOCK)

For reads, these output signals define which cache block (line)
is going to receive the data. For write-backs, these lines
specify which block is being flushed. They are driven together
with cycle definition for cacheable data reads, TLB
replacement, code fetch cycles, and write-backs. External
hardware can use these signals to observe changes to cache
blocks.


4.2.26 KEN# (CACHE ENABLE) 

The i860 XP microprocessor samples KEN# to determine whether
the data being read for the current cache-miss cycle is to be
cached. When the i860 XP microprocessor generates a read cycle
that can be cached (CACHE# output active) and KEN# is active,
the cycle is transformed into a burst line fill. By activating
KEN#, the memory system commits to a four-transfer burst. The
entire 64 bits of the data bus are used for the read,
regardless of the state of the byte-enable pins.

If KEN# is sampled inactive, code fetches are not transferred
in bursts, but 128-bit data items may still be transferred with
a burst length of two.

KEN# is sampled together with NA# or BRDY#, 
whichever comes first. It is sampled only with the 
first BRDY# of a burst; its value at any other time 
has no effect. 

4.2.27 LEN (DATA LENGTH) 

The LEN output pin specifies the number of burst transfers for
each cycle. This pin and the CACHE# output pin are used by the
system to determine the burst length for each cycle (refer to
Table 4.4). The i860 XP microprocessor can generate 1-, 2-, or
4-transfer bursts for reads and writes.

LEN is inactive if the internal request is for 64 bits or less.
If LEN is active, the internal request is for 128 bits or more,
and the cycle should be returned as a two- or four-transfer
burst. LEN is always active for 128-bit data accesses. LEN is
always inactive for code accesses.

A cacheable read (CACHE# active) can be auto- 
matically converted to a four-transfer burst regard- 
less of LEN by assertion of KEN#. 

Table 4.4 summarizes different cycle lengths as they are
calculated from the LEN and CACHE# signals. LEN has the same
timing as the address.
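
The burst-length rules stated in the CACHE#, KEN#, and LEN descriptions can be collected into one small function. This is a sketch assembled from the prose of those sections, not a transcription of Table 4.4, which remains the authoritative source; the function and parameter names are hypothetical.

```python
def transfers_in_cycle(is_write, len_active, cache_active, ken_active=False):
    """Estimate the number of 64-bit transfers in a bus cycle.

    Arguments are logical states (True = signal active), not pin levels.
    """
    if is_write:
        # CACHE# is asserted on writes only for cache line write-backs,
        # which move a whole 32-byte line (four transfers).
        if cache_active:
            return 4
        return 2 if len_active else 1
    # A cacheable read becomes a four-transfer line fill only when the
    # system also activates KEN#, regardless of LEN.
    if cache_active and ken_active:
        return 4
    # Otherwise LEN alone distinguishes 128-bit requests (two transfers)
    # from requests of 64 bits or less (one transfer).
    return 2 if len_active else 1
```

For example, a 128-bit load that the system declines to cache (KEN# inactive) still arrives as a two-transfer burst, while a code fetch with KEN# inactive is a single transfer because LEN is always inactive for code accesses.
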

4.2.28 LOCK# (ADDRESS LOCK) 

This signal is used to provide atomic (indivisible)
read-modify-write sequences in multiprocessor systems. The
address to be locked is the one being driven on A31-A3 when
LOCK# is activated. A multiprocessor bus arbiter must permit
only one processor a locked read, locked write, or unlocked
write to that address and must maintain the lock of that
location across cycle boundaries until LOCK# deactivates. The
simplest arbitration hardware can just lock the entire bus
against all other accesses during LOCK# assertion; however,
software must never assume that this implementation is being
used.

The i860 XP microprocessor coordinates the external LOCK#
signal with the lock and unlock instructions. Programmers do
not have to be concerned about the fact that bus activity is
not always synchronous with instruction execution. LOCK# is
asserted with ADS# for the address operand of the first load or
store instruction executed after the lock instruction.

After an unlock instruction, LOCK# is deasserted with the next
load or store. The i860 XP microprocessor deactivates LOCK# one
clock after ADS# for the last locked bus cycle. Unlike the
i860 XR microprocessor, the i860 XP microprocessor does not
deassert LOCK# immediately when a trap occurs. Instead, the
trap handler must execute a load or store instruction to
deassert LOCK#. (The handler does not have to execute an unlock
instruction, however. The unlocking function is performed by
the processor's trap logic.)

The i860 XP microprocessor also asserts LOCK# during TLB miss
processing for updates of the accessed bit in page-directory
and page-table entries. The maximum time that LOCK# can be
asserted in this case is the time required to perform a
nonpipelined, four-byte, read-modify-write sequence.

Between locked sequences, at least one cycle of no LOCK# is
guaranteed by the behavior of the unlock instruction.


Between lock and unlock instructions, the INT pin is ignored.

Instruction fetches do not alter the LOCK# signal.


4.2.29 M/IO# (MEMORY-I/O)

M/IO# specifies whether the current cycle is for the memory
address space or for the I/O address space. M/IO# is one of the
bus cycle definition pins. Tables 4.2 and 4.3 show the types of
bus cycle generated. The value of this pin changes only when
ADS# is asserted.


4.2.30 NA# (NEXT ADDRESS REQUEST)

NA# makes address pipelining possible. The system asserts NA#
for at least one clock to indicate that it is ready to accept
the next address from the i860 XP microprocessor. (If the
system does not implement pipelining, NA# must not be
activated.) The i860 XP microprocessor samples NA# every clock,
starting one clock after the activation of ADS#. If the i860 XP
microprocessor has a new cycle pending internally when NA# is
activated, it initiates that cycle in the clock after NA# is
asserted. Up to three bus cycles can be outstanding
simultaneously.

NA# is latched internally; the i860 XP microprocessor remembers
that NA# was asserted until it has an internal request to send
to the bus; so, assertion of NA# for a single clock can trigger
an ADS# several clocks later. NA# is ignored in the clock of
ADS#.

KEN# and WB/WT# inputs for the current cycle are sampled with
NA#, if NA# is asserted before the first BRDY# of the current
cycle.

NA# is also used in conjunction with FLINE# to invoke
write-back of a modified line during outstanding bus cycles.


4.2.31 NENE# (NEXT NEAR)

The i860 XP microprocessor asserts NENE# when the current
address is in the same DRAM page as the previous bus cycle.
This signal allows higher-speed reads and writes in the case of
consecutive accesses to static column or page-mode DRAMs. The
i860 XP microprocessor determines the DRAM page size by
inspecting the software-controlled DPS field in the dirbase
register. The page size can range from 2^ to 64-bit words,
supporting DRAM sizes from 256K x 1 to 4G x n. The value of
this pin changes only when ADS# is asserted. NENE# is never
asserted for the next bus cycle after the address bus has been
floating (after AHOLD, BOFF#, or HLDA is deasserted).


4.2.32 PCD (PAGE CACHE DISABLE)

PCD provides a cacheability indication on a page-by-page basis.
This signal, together with PWT, is set to an attribute bit in
the page table entry for the current cycle. When paging is
enabled, PCD corresponds to the CD bit (bit 4) of the page
table entry. The i860 XP microprocessor does not perform a
cache fill to any page for which CD of the page table entry is
set. When paging is disabled, or for any cycle that is not
paged (ldio, stio, ldint, scyc), the i860 XP microprocessor
drives PCD inactive.

During TLB miss processing, PCD is inactive while the address
translation hardware is accessing the first-level page
directory. During accesses to the second-level page-table
entry, PCD reflects the CD value taken from the first-level
page-table entry.

The value of this pin changes only when ADS# is asserted.


4.2.33 PCHK# (PARITY CHECK) 

This output shows the result of the parity check on data pins
in the previous clock of a read cycle. It is asserted for one
clock when incorrect parity has been detected. It reflects the
parity status for the entire data bus.

PCHK# does not terminate outstanding bus cycles, so the system
must still activate BRDY# a sufficient number of times or
activate BOFF# for those cycles. PCHK# is always inactive after
any code fetch in CS8 mode.

4.2.34 PCYC (PAGE CYCLE) 

The page cycle line is active during memory read or 
write cycles to distinguish page-table accesses from 
other accesses. The types of bus cycle generated 
are indicated in Tables 4.2 and 4.3. The value of this 
pin changes only when ADS# is asserted. 

4.2.35 PEN# (PARITY ENABLE) 

The i860 XP microprocessor samples this signal for read cycles
on the same clock edge at which BRDY# is found asserted. If
sampled active, the i860 XP microprocessor feeds the parity
check result into the interrupt logic. If a parity error is
encountered, the i860 XP microprocessor vectors to the trap
handler. The BEAR register latches the offending address, as
described with the BERR signal. This interrupt is not masked by
the IM bit of the psr, nor is it masked during lock cycles.

The system should deassert PEN# any time the 
DP7-DP0 pins are known not to reflect the parity of 
the full eight-byte bus (for example, reads from I/O 
devices or ROMs that are not parity protected). 

The system should deassert PEN# during code 
fetches in CS8 mode. 

At RESET, the value of PEN# determines the output buffer
configuration for ADS#, A31-A3, BE7#-BE0#, W/R#, and HITM#.
These pins are configured in normal mode (small output buffers)
if PEN# is sampled active for at least the last three clocks
before RESET deactivates. Otherwise, these pins are configured
in high-current mode (large output buffers).

4.2.36 PWT (PAGE WRITE-THROUGH) 

PWT provides a write-back/write-through indication on a
page-by-page basis. This signal, together with PCD, is set to
an attribute bit in the page table entry for the current cycle.
When paging is enabled, PWT corresponds to the WT bit (bit 3),
and write-back caching is implemented for this page only if WT
is clear. When paging is disabled, or for any cycle that is not
paged (ldio, stio, ldint, scyc), the i860 XP microprocessor
drives PWT inactive.

During TLB miss processing, PWT is inactive while the address
translation hardware is accessing the first-level page
directory. During accesses to the second-level page-table
entry, PWT reflects the WT value taken from the first-level
page-table entry.

The value of this pin changes only when ADS# is asserted.
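
The mapping from page table entry to the PCD and PWT pins can be sketched as a pair of bit extractions. The bit positions (CD at bit 4, WT at bit 3) come from the text above; the function name and the boolean parameters are hypothetical, and the sketch returns logical states (1 = active) rather than electrical levels.

```python
CD_BIT = 4  # cache-disable attribute bit of the page table entry
WT_BIT = 3  # write-through attribute bit of the page table entry


def page_attribute_pins(pte, paging_enabled=True, paged_cycle=True):
    """Return (PCD, PWT) logical states for a cycle using this entry.

    Both pins are driven inactive when paging is disabled or when the
    cycle is not paged (ldio, stio, ldint, scyc).
    """
    if not (paging_enabled and paged_cycle):
        return 0, 0
    pcd = (pte >> CD_BIT) & 1
    pwt = (pte >> WT_BIT) & 1
    return pcd, pwt
```

An external cache controller could use the returned pair to decide whether to allocate a line (PCD inactive) and whether to keep it write-back (PWT inactive).
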

4.2.37 RESET (SYSTEM RESET) 

Asserting RESET for at least ten CLK periods caus- 
es initialization of the i860 XP microprocessor. On 
power up, RESET should remain active at least one 
millisecond after Vcc and CLK have reached their 
proper DC and AC specs. RESET is synchronous 
with CLK. 

After the RESET signal goes inactive, the processor remains in
the RESET state for three more clocks. Applications that use
the HOLD signal to float the bus during RESET should keep HOLD
active for three more clocks after the RESET signal is
deactivated.
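
Several pins double as configuration straps that are sampled during the last three clocks before RESET deactivates (see the EWBE#, FLINE#, INT/CS8, and PEN# descriptions). The sketch below gathers those rules in one place. It is an illustration under assumptions: the function name, the argument format (a sequence of sampled pin levels, oldest first), and the dictionary keys are all hypothetical.

```python
def reset_configuration(ewbe_n, fline_n, int_pin, pen_n):
    """Decode the strap options from pin levels sampled before RESET ends.

    Each argument is a sequence of 0/1 pin levels, one per clock, ending
    with the last clock before RESET deactivates. EWBE#, FLINE#, and PEN#
    are active low; INT is active high.
    """
    def held_active(samples, active_level):
        last_three = list(samples)[-3:]
        return len(last_three) == 3 and all(s == active_level for s in last_three)

    return {
        "ordering": "strong" if held_active(ewbe_n, 0) else "weak",
        "late_back_off": held_active(fline_n, 0),
        "cs8_mode": held_active(int_pin, 1),
        "output_buffers": "normal" if held_active(pen_n, 0) else "high-current",
    }
```

For instance, a board that ties EWBE# to Vss and leaves the other straps inactive would decode as strong ordering, no late back-off, no CS8 mode, and high-current output buffers.
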

4.2.38 RSRVD, SPARE 

The RSRVD input is reserved by Intel Corporation and must be
tied HIGH to Vcc through a resistor (5 KΩ). The spare input
should be left unconnected.

4.2.39 TCK (TEST CLOCK) 

This is the clock input for the TAP (test access port). 
If the TAP is to be used, this signal must be connect- 
ed to a clock synchronous to CLK. If the TAP is not 
used, TCK can be tied low. TCK does not need to be 
kept running when boundary scan is not active. 

The rising edge of TCK must be externally synchronized to CLK.
The boundary scan latches retain their state when TCK is
stopped at either logic zero or one.

4.2.40 TDI (TEST DATA INPUT) 

TDI is the input for test instructions and data to the TAP. TDI
is sampled on the rising edge of TCK. It is provided with an
internal pull-up resistor, so that an open circuit at TDI
produces a result equivalent to driving continuous HIGH
signals.

4.2.41 TDO (TEST DATA OUTPUT) 

This is the serial output of the TAP. The contents of TAP
registers are shifted out through TDO on the falling edge of
TCK. The data is moved from TDI to TDO without inversion, which
allows easy serial cascading of different components for
scanning.

TDO is held in high-impedance state, except while 
scanning is in progress. This allows parallel connec- 
tion of these outputs for several components. 

4.2.42 TMS (TEST MODE SELECT)

This input is decoded by the TAP to select the operation of the
TAP. It is sampled at the rising edge of TCK. It is provided
with an internal pull-up resistor to assure deterministic
behavior for open-circuit failure at this pin. If boundary scan
is not used, TMS can be tied high or left unconnected.


4.2.43 TRST# (TEST RESET) 

This input resets the TAP. If the TAP is not used, TRST# should
be tied LOW. To ensure deterministic behavior of the test
logic, TMS should be held HIGH while TRST# changes from LOW to
HIGH.


4.2.44 Vcc (SYSTEM POWER) AND Vss 
(GROUND) 

The i860 XP microprocessor has 54 pins for power 
and 56 for ground. All pins must be connected to the 
appropriate low-inductance power and ground sig- 
nals in the system. 


4.2.45 VccCLK (CLOCK POWER) 

This is the power supply for the internal CLK buffer. It should
be connected to the same Vcc plane as the other Vcc pins.


4.2.46 WB/WT# (WRITE-BACK/WRITE- 
THROUGH) 

This input signal defines cache policy for the line being
accessed in the current bus cycle. The processor samples WB/WT#
for both reads and writes on the same clock edge at which it
finds NA# or the first BRDY# asserted, whichever comes first.
If this signal is sampled low, the write-through policy is
applied to the cache line: if an internal write hits this line,
it causes a write-through cycle. If this signal is sampled
high, the write-back policy is applied: future write hits to
this line do not show up on the bus.


4.2.47 W/R# (WRITE/READ) 

This pin specifies whether a bus cycle is a read (LOW) or write
(HIGH) cycle. Tables 4.2 and 4.3 show the types of bus cycle
generated. The value of this pin changes only when ADS# is
asserted.


5.0 BUS OPERATION 


The interaction among signals is illustrated by timing 
diagrams. Figure 5.1 shows the conventions used in 
the timing diagrams. 



5.1 Bus Cycles 


A bus cycle begins when the i860 XP microprocessor activates
ADS# and ends when the system activates the last of a
predetermined number of BRDY# signals. Table 4.4 shows how the
i860 XP microprocessor and the external system cooperate to
determine the number of BRDY# activations in each cycle. The
processor starts sampling BRDY# one clock after assertion of
ADS# and continues sampling in every clock until the last BRDY#
becomes active.


The i860 XP microprocessor supports several differ- 
ent types of bus cycle. These are introduced in order 
of complexity: 

1. Single-transfer cycles

2. Multiple-transfer (burst) cycles 

3. Pipelined cycles 

4. Cache inquiry cycles 
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5.1.1 SINdLE-TRANSFER CYCLE 

The simplest bus cycle is the single-transfer,
noncacheable, 64-bit cycle either with or without wait
states. The shortest bus cycle is two clock periods 
long. Read and write cycles of this type are shown in 
Figure 5.2. 

A wait state is any clock in which the i860 XP
microprocessor samples BRDY# but the system does not
assert it. The system can add wait states to any cy- 
cle. Figure 5.3 shows cycles with two wait states 
added. Any number of wait states can be added to 
i860 XP microprocessor bus cycles by maintaining 
BRDY# inactive. 
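
As an illustration of the wait-state definition above, the following
Python sketch counts wait states from a trace of BRDY# samples and
derives the length of a single-transfer cycle. The function names and
the list-of-booleans trace format are assumptions made for this
example; they are not part of the i860 XP documentation.

```python
def count_wait_states(brdy_samples):
    """Count clocks in which BRDY# was sampled but not asserted.

    brdy_samples[i] is True when BRDY# is asserted in the i-th clock
    after ADS# (hypothetical trace format for this sketch).
    """
    waits = 0
    for asserted in brdy_samples:
        if asserted:
            return waits
        waits += 1
    raise ValueError("BRDY# was never asserted; the cycle did not end")


def single_transfer_clocks(brdy_samples):
    """One clock for ADS#, the wait states, then the clock with BRDY#."""
    return 1 + count_wait_states(brdy_samples) + 1
```

With no wait states the result is the two-clock minimum stated above;
a trace of `[False, False, True]` models the two added wait states of
Figure 5.3.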


5.1.2 BURST CYCLES 

When a bus request requires more than a single 
data transfer (refer to Table 4.4), the i860 XP micro- 
processor requires that the memory system perform 
a burst data transfer. Burst cycles allow the maxi- 
mum bus transfer rate by eliminating unnecessary 
driving of the address bus. The addresses of the 
data items in burst cycles all fall within the same 32- 
byte aligned area (corresponding to an internal 
i860 XP microprocessor cache line). Given the ad- 
dress of the first transfer, external hardware can cal- 
culate the addresses of subsequent transfers. With 
these addresses eliminated from the bus, a new 
data item can be sampled into the i860 XP micro- 
processor every clock period. 

The fastest possible burst cycle requires two clock
periods for the first data item: one clock for ADS#
and one clock for BRDY#; subsequent data items
are transferred every clock period. One such bus
cycle is shown in Figure 5.4. Note that, in this case,
the initial cycle generated by the i860 XP microprocessor
could be satisfied by a single data transfer, but
the system transforms it into a multiple-transfer
cache line fill by activating KEN# in the clock period
of the first BRDY#. KEN# has this effect only if the
CACHE# pin is active, which means the cycle is
internally cacheable in the i860 XP microprocessor.


Read data is sampled only in the clock period in
which BRDY# is returned, which means that data
need not be sent to the i860 XP microprocessor every
clock period in the burst cycle. Figure 5.5 shows
an example of a burst cycle in which two clock
periods are required for every burst item.
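
The burst timing described above can be sketched as simple
arithmetic. The Python function below is an illustration only: its
name and the assumption that every item in a slow burst takes the same
number of clocks are ours, not the datasheet's.

```python
def burst_clocks(transfers, clocks_per_item=1):
    """Clock periods for a burst: one clock for ADS#, then
    clocks_per_item clocks for each data item (assumed uniform)."""
    if transfers < 1 or clocks_per_item < 1:
        raise ValueError("transfers and clocks_per_item must be >= 1")
    return 1 + transfers * clocks_per_item
```

Under these assumptions, a four-transfer burst at full speed takes
`burst_clocks(4)` clocks, and the same burst with two clocks per item
takes `burst_clocks(4, 2)` clocks.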

The burst length attributes LEN and CACHE# are
driven with the address. Figure 5.6 illustrates two
consecutive burst cycles with differing length
attributes: the first one is a noncacheable 128-bit read,
and the second one is a cache line fill initiated by a
cacheable 64-bit read.

NOTE:

1. KEN# driven with first assertion of BRDY#


Figure 5.4. Basic Burst Cycle


NOTE:

1. Wait states added by delaying assertion of BRDY#


Figure 5.5. Slow Burst Cycle

The timing of write bursts is similar to that of read
bursts. The i860 XP microprocessor does not put
data on D63-D0 for writes until the clock period
after ADS#.

When initiating any read, the i860 XP microprocessor
presents the address for the data item requested.
When the cycle is converted into a cache fill, the
first data item returned corresponds to the address
sent out by the i860 XP microprocessor. The remaining
items must be returned in the order shown in
Table 5.1. This ordering is optimized for two-bank
memories, but works equally well with noninterleaved
memories.

In i860 XP microprocessor systems, memory must
support the burst order as defined in Table 5.1 for
reads. For writes, the burst addresses are always
increasing, so writes with four transfers match the
first line of the table. In CS8 (code-size 8 bits) mode,
instructions are not fetched in bursts.

Note that the i860 XP microprocessor drives only
the first address of a burst cycle; the memory system
is responsible for calculating subsequent addresses
as shown in the table. The addresses can
be derived by complementing A3 after every transfer,
and complementing A4 after two transfers.


Table 5.1. Burst Order for Cache Line Transfers


1st        2nd        3rd        4th
Address    Address    Address    Address

0          8          0x10       0x18
8          0          0x18       0x10
0x10       0x18       0          8
0x18       0x10       8          0
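
The rule for deriving burst addresses (complement A3 after every
transfer, complement A4 after two transfers) can be sketched in
Python as follows. This is an illustrative model for external memory
logic, not Intel-supplied code; the function name is hypothetical.

```python
def read_burst_addresses(first_addr):
    """Return the four addresses of a cache line fill, in bus order.

    Complementing A3 after every transfer and A4 after two transfers
    is the same as XOR-ing the first offset with 0, 8, 0x10 and 0x18.
    """
    base = first_addr & ~0x1F      # 32-byte aligned line address
    offset = first_addr & 0x18     # A4-A3 of the first transfer
    return [base | (offset ^ step) for step in (0x00, 0x08, 0x10, 0x18)]
```

For example, `read_burst_addresses(0x1008)` yields offsets 8, 0, 0x18,
0x10 within the line at 0x1000, matching the second row of Table 5.1.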


5.1.3 PIPELINED CYCLES 

A pipelined cycle is one that starts while one or two 
other bus cycles are outstanding. A cycle is consid- 
ered outstanding until the last BRDY# is asserted to 
terminate that cycle. A nonpipelined cycle is one 
that starts when no other bus cycles are outstand- 
ing. Both types of cycle can be either read or write 
cycles. To allow high transfer rates in large memory 
systems, the i860 XP microprocessor supports two- 
level pipelining. New cycles can start as often as 
every other clock until three cycles are outstanding. 

The system asserts NA# to indicate that the
i860 XP microprocessor can start another cycle before
the current one is completed. (NA# can even
be asserted while BRDY# is active.) The i860 XP
microprocessor begins sampling NA# in the next
clock after ADS# is asserted. If the following
conditions are met, a new (pipelined) cycle begins:

1. NA# having been active 

2. An internal request pending 

3. Compatibility between the pending request and 
the outstanding requests (refer to Table 5.2) 

4. HOLD, BOFF#, and AHOLD not active 

5. Fewer than three cycles outstanding 
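
The five conditions above can be summarized as a single predicate.
The Python sketch below is only an illustration: the `BusState` record
and its field names are hypothetical, each boolean means "signal or
condition is active" regardless of electrical polarity, and the
`compatible` field stands in for a Table 5.2 lookup that is not
modeled here.

```python
from dataclasses import dataclass


@dataclass
class BusState:
    na_seen: bool          # NA# has been sampled active
    request_pending: bool  # an internal request is waiting
    compatible: bool       # pending request may follow outstanding ones
    hold: bool             # HOLD active
    boff: bool             # BOFF# active
    ahold: bool            # AHOLD active
    outstanding: int       # number of cycles currently outstanding


def can_start_pipelined_cycle(s):
    """True when all five listed conditions hold."""
    return (s.na_seen
            and s.request_pending
            and s.compatible
            and not (s.hold or s.boff or s.ahold)
            and s.outstanding < 3)
```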

The following "compatibility" rules determine when
the processor does not issue a pipelined ADS#
(they are the source of Table 5.2):

• Data cache line fills are pipelined into each other
only in the case of an aliasing virtual tag miss with
a physical tag hit.

• Reads can be pipelined into TLB miss writes. TLB
misses for instructions can be pipelined into data
accesses, and vice versa.

• No data cycle is ever pipelined while LOCK# is
active.

• I/O cycles, special cycles, and ldint cycles never
begin when any cycle is outstanding.

NA# may be asserted before, simultaneously with, 
or after the first BRDY# of the current cycle. If NA# 
is asserted before the first BRDY#, the cacheability 
(KEN#) and cache policy (WB/WT#) indicators for 
the current cycle are sampled during the same clock 
period as NA# is sampled active; otherwise, they 
are sampled with the first BRDY#. Figure 5.7 shows 
an example of four-transfer, pipelined, back-to-back 
reads. Note the timing of KEN#. Because NA# is 
asserted before the first BRDY# of the cycle A, 
KEN# is sampled with the NA# for cycle B. 


Table 5.2. Pipeline Cycle Compatibility


If A is Outstanding, can B be Pipelined into It?

B (cycle to be pipelined), by column:

(1) Data Cache Line Fill
(2) Data Cache Store Miss, Write-Thru
(3) Data Cache Read Miss, KEN# = 1
(4) Write-Back**
(5) Instruction Fetch
(6) pfld
(7) TLB Miss
(8) ldio, stio, ldint, scyc
(9) LOCK# Active

A (PREVIOUS CYCLE)                 (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)

Data Cache Line Fill               YES*  YES*  YES*  YES   YES   YES*  YES   NO    YES
Data Cache Store Miss, Write-Thru  YES   YES   YES   YES   YES   YES   YES   NO    YES
Data Cache Read Miss, KEN# = 1     YES*  YES*  YES*  YES*  YES   YES*  YES   NO    YES*
Write-Back                         YES   YES   YES   NO    YES   YES   YES   NO    YES
Instruction Fetch                  YES   YES   YES   YES   YES   YES   YES   NO    YES
pfld                               YES   YES   YES   YES   YES   YES   YES   NO    YES
TLB Miss                           YES   YES   YES   YES   YES   YES   YES   NO    YES
stio, scyc                         YES   YES   YES   YES   YES   YES   YES   NO    YES
ldio, ldint                        NO    NO    NO    NO    YES   NO    YES   NO    NO
LOCK# Active                       NO    NO    NO    NO    YES   NO    YES   NO    NO


NOTE:

* Pipelining can occur if the first ADS# is for an aliasing virtual tag miss with a physical tag hit.
** Inquiry write-backs are not pipelined into prior cycle unless FLINE# is asserted.

Write cycles can be pipelined into read cycles and
vice versa, but, in both cases, the processor will
leave one clock between bursts to allow bus turnover,
and will ignore any BRDY# given to it at that
time. Pipelined back-to-back read and write cycles
are shown in Figure 5.8. On writes, assertion of NA#
does not cause the values on the data bus to
change; it just enables new address and cycle
specification outputs.


5.1.4 INTERRUPT ACKNOWLEDGE CYCLES 

In response to a trap caused by assertion of the INT 
pin, trap-handling software can generate interrupt 
acknowledge cycles by executing a procedure simi- 
lar to the following. 


// The following lock instruction must be on a 32-byte boundary.

    lock                          // Lock the bus
    ldint.b   src2, rdest         // First INTA cycle. Src2 contains 8.
    or        rdest, r0, rdest    // Won't proceed until rdest loaded.
    unlock                        // Unlock the bus after the next ldint
    //nop                         // Insert 4 + <number of NOPs> idle
    //nop                         // clocks for 8259A recovery.
    ldint.b   r0, rdest           // Second INTA cycle

Figure 5.9 shows the interrupt acknowledge cycles
generated by the code sequence. Interrupt acknowledge
cycles are generated in locked pairs. The interrupt
vector is returned during the second cycle. Each
of the interrupt acknowledge cycles is terminated
when the external system responds by asserting
BRDY#. Wait states can be added by withholding
BRDY#. There must be a number of idle clocks
between the first and second cycles to allow for 8259A
recovery time. The software controls the number of
intervening clocks via the number of nop instructions
in the interrupt acknowledge routine.

5.1.5 SPECIAL BUS CYCLES 

The i860 XP microprocessor provides a special
cycle to indicate to the external system that certain
internal conditions have occurred. The special bus
cycle (indicated by M/IO# = 0, D/C# = 0, and
W/R# = 1) is generated by the i860 XP microprocessor
as a response to scyc instruction execution.
This cycle (defined in Table 5.3) is used to flush or
invalidate a secondary cache. The defined value of
byte enables can be generated by using an appropriate
address operand in the scyc instruction. The
scyc instruction does not have any effect on the
internal caches. External hardware must acknowledge
a special bus cycle by asserting BRDY# once.
The data driven on the data bus with BRDY# is
undefined. The effect of scyc is determined by
decoders in external hardware.

Table 5.3. Encoding of Special Bus Cycles


BE7#-BE0#    Special Bus Cycle

11110111     Write Back External Cache and Invalidate
11111011     Halt
11111101     Invalidate External Cache
11111110     Shut Down


All other encodings are reserved.
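
External decode logic for special bus cycles might be modeled as in
the Python sketch below. It is an illustration only: the function and
table names are hypothetical, and the pairing of byte-enable patterns
with cycle names assumes that the encodings and names of Table 5.3
correspond in the order listed.

```python
# Assumed pairing of BE7#-BE0# patterns with the names in Table 5.3.
SPECIAL_CYCLES = {
    0b11110111: "Write Back External Cache and Invalidate",
    0b11111011: "Halt",
    0b11111101: "Invalidate External Cache",
    0b11111110: "Shut Down",
}


def decode_special_cycle(m_io, d_c, w_r, byte_enables):
    """Return the special cycle name, "reserved", or None.

    None means the cycle definition pins do not indicate a special
    bus cycle (M/IO# = 0, D/C# = 0, W/R# = 1).
    """
    if (m_io, d_c, w_r) != (0, 0, 1):
        return None
    return SPECIAL_CYCLES.get(byte_enables & 0xFF, "reserved")
```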


5.2 Bus Arbitration 

The i860 XP microprocessor responds to three different
signals that tell it to stop driving the bus:

HOLD Finishes outstanding cycles before giving 
up the bus. 

BOFF# Aborts outstanding cycles and gives up bus 
immediately. 

AHOLD Stops driving address bus and permits a 
cache inquiry. 

AHOLD results in a partial hold state, which is cov- 
ered in Section 5.3. The present section concen- 
trates on HOLD and BOFF#. 

When in a hold state (due either to HOLD or
BOFF#), the i860 XP microprocessor uses BREQ to
request control of the bus. If holding due to HOLD,
AHOLD, or BOFF#, the processor activates BREQ
in the clock after an internal bus request is
generated. (In the case of HOLD, BREQ is asserted even
though HLDA is asserted.) If holding due to BOFF#
and cycles need to be restarted or there is a new
internal request, it asserts the BREQ signal within
four clock periods after the assertion of BOFF#. In
all cases, BREQ remains active at least until the
clock after ADS# is activated for the requested
cycle.


5.2.1 HOLD AND HLDA ARBITRATION 

HOLD indicates to the i860 XP microprocessor that
another bus master needs control of the bus. When
HOLD is asserted, the i860 XP microprocessor
keeps control of the bus until all outstanding cycles
are completed. Then it floats the output signals
(except BREQ, HLDA, LOCK#, PCHK#, HIT#, and
HITM#) and asserts HLDA. These outputs remain at
the high-impedance state until HOLD is deasserted.

HLDA may be asserted as soon as the clock period
after the one in which HOLD is asserted. HLDA may
be deasserted as soon as the clock after the one in
which HOLD is deasserted.

An example HOLD/HLDA transaction is shown in
Figure 5.10. The i860 XP microprocessor recognizes
HOLD even while RESET is asserted, and it drives
HLDA in this case as well.

HOLD is recognized even when BOFF# is active,
and the i860 XP microprocessor responds with
HLDA the same as when the bus is idle.


5.2.2 BUS CYCLE BACK-OFF AND RESTART 

The i860 XP microprocessor provides the ability to 
abort bus cycles and restart them again. It is neces- 
sary to abort cycles for reasons such as the follow- 
ing: 

1. Retry after an error is detected by ECC or parity
logic.

2. Escape from a deadlock; for example, when the
i860 XP microprocessor is using A31-A3 to load
a new cache line, but the 82495XP cache controller
needs A31-A5 to invalidate a line in the
CPU cache which the 82495XP cache controller
is replacing in its cache in order to satisfy the
CPU's line-fill request.


3. Maintain cache consistency; for example, the 
i860 XP microprocessor is attempting to read or 
write to a line that has been modified in the cache 
of another CPU. 

4. Prevent illegal access to an address already 
locked by another CPU in a multiprocessor sys- 
tem. 


5.2.2.1 Cycle Back-Off 

Bus cycles are aborted when the system asserts
BOFF#. The i860 XP microprocessor samples this
pin in every clock period that it is driving the bus.
When BOFF# is asserted, the i860 XP microprocessor
immediately (in the next clock period) floats the
bus. It floats the ADS# pin one clock period later,
thereby giving time for ADS# to be deasserted so
that it is not left floating active. The i860 XP
microprocessor floats the same pins as for HOLD, but
HLDA is not asserted. If a bus cycle is in progress at
the time BOFF# is asserted, the cycle is aborted
and, in a read cycle, any data returned to the
processor while BOFF# is active is ignored. BOFF#
overrides BRDY#; so, if both are sampled active in
the same clock, BRDY# is ignored. BOFF# aborts
a burst cycle even if it arrives with the last BRDY#
of the cycle. However, for read bursts, data transfers
completed before assertion of BOFF# are used by
the processor if they satisfy an internal request.
Cacheable data is cached in spite of BOFF#; however,
the cached data is overwritten when the cycle
is restarted.

The bus remains in the high-impedance state until
BOFF# is deasserted. If cycles need to be restarted
or if a new internal request has been generated, the
BREQ signal is asserted within four clock periods
after the assertion of BOFF#.


5.2.2.2 Cycle Restart 

When the system deasserts BOFF#, the i860 XP
microprocessor restarts aborted bus cycles from the
beginning by driving the address and status (A31-A3,
W/R#, D/C#, etc.) and asserting ADS#. If
more than one cycle was outstanding when BOFF#
was asserted, the i860 XP microprocessor restarts
all outstanding cycles in the same order. If HITM# is
active due to an inquiry, the write-back for it will be
the first cycle after deassertion of BOFF#. BOFF#
restarts all aborted cycles except:

• The stale cycles mentioned in section 5.3.5.

• The read that may have been generated by an
alias hit (virtual tag miss, but physical tag hit).

• The read that may have been generated by a
pfld that hit the data cache.

If the processor's KEN# pin was active (with NA#
or first BRDY#) before the cycle was aborted,
external hardware must activate it again after the cycle is
restarted. In other words, the system cannot use
BOFF# to change the cacheability of a cycle via
KEN#.

The LOCK# signal is not affected by restarted cy- 
cles; it retains its state in spite of BOFF# assertion. 


5.2.2.3 Late Back-Off Modes 

In some cases the logic that needs to assert 
BOFF# cannot make the necessary decision in time 
to cancel the relevant cycle or data transfer. For ex- 
ample: 

1. The result of checking ECC or parity may not be
available until one or two cycles after the BRDY#
to which it corresponds.

2. When the i860 XP microprocessor is attempting
to read or write to a line that might be modified in
the cache of another processor on the same bus,
it may be advantageous to let part of a burst run
in parallel with inquiries to the other processors,
rather than delay the entire burst until the
inquiries are finished.


For such situations, the i860 XP microprocessor
provides late back-off mode. For a read cycle in this
mode, the processor employs a buffer to internally
delay data and BRDY#, which allows BOFF#
assertion to be delayed relative to the external
BRDY#. Likewise, for a write cycle in this mode,
BOFF# assertion can be delayed relative to
BRDY#. However, data for a write cycle is not
delayed.


Two flavors of late back-off mode are provided: 


1. One allows BOFF# to be delayed by one clock 
period relative to the data transfer. The proces- 
sor enters one-clock late back-off mode when 
the FLINE# pin has been sampled active for at 
least three clock periods when RESET deacti- 
vates. 



2. The other allows BOFF# to be delayed by up to 
two clock periods relative to the data transfer. 
The i860 XP microprocessor enters this mode 
when software sets the LB bit of the dirbase 
register. 


If the processor enters one-clock late back-off mode 
during RESET, it is impossible to enter two-clock 
late back-off mode. The LB bit has no effect. Fur- 
thermore, software cannot exit two-clock late back- 
off mode once it is activated, and the LB bit cannot 
be cleared except by resetting the processor. 
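
The mode-selection rules above can be condensed into a small
decision function. The Python sketch below is illustrative; the
function and parameter names are hypothetical, and the return value
is simply the number of clocks by which BOFF# may lag the data
transfer.

```python
def late_backoff_clocks(fline_active_clocks_at_reset, lb_bit_set):
    """Return 0 (normal mode), 1, or 2 (late back-off modes).

    fline_active_clocks_at_reset: consecutive clocks FLINE# was
    sampled active when RESET deactivates (assumed input format).
    lb_bit_set: True once software has set the LB bit of dirbase.
    """
    if fline_active_clocks_at_reset >= 3:
        return 1   # one-clock mode; the LB bit has no effect
    if lb_bit_set:
        return 2   # two-clock mode, cleared only by reset
    return 0
```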

Figures 5.12-5.17 illustrate variations on late
back-off mode cycles. BOFF# can be (and usually is)
asserted longer than one clock period, as Figure 5.11
shows; the remaining figures show an active time of
only one clock.


5.2.2.4 One-Clock Late Back-Off Mode 

In one-clock late back-off mode the data is delayed
internally by one clock before it is used.

In this mode, data and BRDY# are seen by internal
logic one clock period later than they appear on the
bus, which is equivalent to adding an extra wait state
to reads on the external bus (Figure 5.13). All
responses to BRDY# (assertion of the ADS# for the
next cycle, assertion of HLDA in response to a
HOLD request, and deassertion of HITM#) are
delayed by one clock period compared to the normal
mode of operation. Not delayed, however, are write
data on D63-D0 and sampling of KEN# and
WB/WT#. KEN# and WB/WT# must be valid with the
first BRDY# assertion. Also, the response to NA#
(assertion of ADS#) is not delayed if fewer than
three pipelined cycles are outstanding.

If BOFF# is asserted as late as the second BRDY#
(Figure 5.14), it cancels the entire cycle, ignores
data latched with the first BRDY#, and ignores the
data being driven with the second BRDY#. This is
true of a two-transfer burst (shown) as well as a
four-transfer burst (not shown).

In a two-transfer burst, if BOFF# is asserted in the
clock after the second BRDY# (Figure 5.15), it still
cancels the cycle.

In a four-transfer burst, if BOFF# is asserted within
one clock after the last BRDY# (Figure 5.16), it still
forces a retry of the cycle, but previously transferred
read data is used by the processor if it satisfies the
read request.


5.2.2.5 Two-Clock Late Back-Off Mode 

Two-clock late back-off mode gives external logic
even more time to decide to use BOFF#. In this
mode, data delivery is delayed by either one or two
clock periods, depending on external activity. For
any BRDY#, the data is delayed by one clock period.
If in the next clock period BRDY# is again asserted,
the previous data is used. However, if in that
next clock period BRDY# remains inactive, the data
is delayed for one extra clock period before it is
used. The responses to BRDY# (assertion of the
ADS# for the next cycle, assertion of HLDA, and
deassertion of HITM#) are delayed by one or two
clock periods, depending on the value of BRDY# in
the next clock. The response to NA# (assertion of
ADS#) is not delayed if fewer than three pipelined
cycles are outstanding.
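
One way to read the delay rule above is sketched below in Python.
This is an interpretation offered for illustration, not a
cycle-accurate model: the function name and the per-clock boolean
trace are assumptions, and a BRDY# in the final clock of the trace is
treated as being followed by an inactive clock.

```python
def internal_use_clocks(brdy):
    """For each clock with BRDY# asserted, return the clock in which
    the data is used internally in two-clock late back-off mode.

    brdy[i] is True when BRDY# is asserted in clock i.
    """
    used = []
    for clk, asserted in enumerate(brdy):
        if not asserted:
            continue
        next_asserted = clk + 1 < len(brdy) and brdy[clk + 1]
        # One clock of delay if BRDY# follows immediately, else two.
        used.append(clk + (1 if next_asserted else 2))
    return used
```

For a trace of `[True, True, False, True]`, the first item is used one
clock after its BRDY#, while the second and third are each held for
two clocks.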

The st.c dirbase instruction that sets the LB bit
must be aligned on a 32-byte boundary and must be
followed by seven nop instructions. Software must
not enable late back-off mode when the processor is
used with the 82495XP external cache controller.

Figure 5.15. One-Clock Late Back-Off Mode (Case 2)

NOTES:

A Cacheable 64-bit (or less) cycle (four transfers)

B Next cycle (any type)

1. BOFF# cancels A2 and A3 transfers, but A1 transfer has already satisfied request

2. Cycle A restarts one clock after BOFF# is deasserted

3. Earliest ADS# assertion for next cycle

Figure 5.16. One-Clock Late Back-Off Mode (Case 3)


Figure 5.17. Two-Clock Late Back-Off Mode


5.3 Cache Inquiry Cycles (Snooping) 

Another processor initiates an inquiry cycle to check
whether an address is cached in the internal data or
instruction cache of the i860 XP microprocessor. An
inquiry cycle differs from any other cycle in that it is
initiated externally to the i860 XP microprocessor,
and the signal for beginning the cycle is EADS#
(External Address Status) instead of ADS#. The
address bus of the i860 XP microprocessor is
bidirectional in order to allow the address of inquiry to be
driven by the system. An inquiry cycle can begin
during any hold state:

1. While HOLD and HLDA are asserted.

2. While BOFF# is asserted.

3. While AHOLD (address hold) is asserted.

If neither a HOLD nor a BOFF# is in effect, the
system can assert AHOLD to interrupt the current bus
activity.

EADS# is first sampled two clocks after BOFF# or
AHOLD assertion, or one clock after HLDA. This
allows time for the processor to float A31-A5 and for
the system to stabilize the inquiry address there.

In the clock in which EADS# is asserted, the
i860 XP microprocessor samples these inputs,
which qualify the type of inquiry:

INV    Specifies whether the line (if found) must
       be invalidated (that is, changed to I-state).

FLINE# Specifies whether the line (if found in
       M-state) must be written back immediately or
       after outstanding bus cycles are completed.

The i860 XP microprocessor compares the address
of the inquiry request with addresses of lines in
cache and of any line in the write-back buffer waiting
to be transferred on the bus. It does not, however,
compare with the address of write-miss data in the
write buffers. Two clock periods after sampling
EADS#, the i860 XP drives the results of the inquiry
look-up on these output pins:

HIT#   Specifies whether the address was found
       (active) or not found (inactive).

HITM#  If active, the line found was in the M-state;
       if inactive, the line was in E- or S-state, or
       was not found.

Figure 5.18 shows an inquiry with AHOLD that misses
the cache. When the system asserts AHOLD, the
i860 XP microprocessor floats A31-A3 in the next
clock period. It does not, however, assert HLDA; no
acknowledge is required. Once the address pins are
floating, external logic drives the address for the
inquiry on A31-A5 and starts the inquiry cycle by
activating EADS#. The i860 XP microprocessor does
not begin sampling EADS# until the second clock
after AHOLD is activated. EADS# activation may be
delayed any number of clocks.

The earliest that AHOLD can be deasserted is the
clock after EADS# assertion. However, by maintaining
AHOLD active, multiple inquiry cycles can be
executed in one AHOLD session (Figure 5.19). The
i860 XP microprocessor can accept inquiry cycles at
a rate of one every other clock period, unless a
write-back is required. The earliest that ADS# can
be asserted for the next cycle is the clock after
AHOLD deassertion.

The second inquiry in Figure 5.19 hits an unmodified
line in the cache. When a cache line with matching
address is found and the INV input signal is asserted
(as in this case), that line is invalidated (changed to
I-state). If the INV signal is inactive, the line enters
S-state.


5.3.1 INQUIRY WRITE-BACK CYCLES 

If an inquiry cycle hits a dirty (M-state) line in the
i860 XP microprocessor cache, the i860 XP
microprocessor asserts the HITM# signal to indicate that
the line will be written on the bus. The HITM# output
becomes valid in the same clock period as HIT#. In
this case the modified line is written out, and the
cache entry is changed to either I or S state
according to INV. The HITM# signal stays active through
the last BRDY# for the corresponding write-back
cycle.
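
The outcome of an inquiry, as described in this section and the
preceding one, can be summarized in a short Python sketch. It is a
simplified illustration: the function name and tuple layout are
hypothetical, booleans mean "signal active", and timing (such as the
two-clock delay before HIT# and HITM# are driven) is not modeled.

```python
def inquiry_result(line_state, inv):
    """Model one inquiry against a line in state 'M', 'E', 'S' or 'I'.

    Returns (hit, hitm, new_state, needs_write_back). State 'I' stands
    for "address not found in the cache".
    """
    if line_state not in ("M", "E", "S", "I"):
        raise ValueError("unknown cache line state")
    if line_state == "I":
        return (False, False, "I", False)
    modified = line_state == "M"
    new_state = "I" if inv else "S"   # INV selects invalidate or share
    return (True, modified, new_state, modified)
```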

An inquiry write-back cycle is similar to ordinary
write-back cycles. It is initiated by assertion of
ADS#. ADS# is asserted even when the AHOLD
signal is active. The cycle definition signals are
driven properly by the processor; however, the address
pins are not driven, because activation of AHOLD
forces the i860 XP microprocessor off the address
bus. If, however, AHOLD is deasserted before or
during the write-back cycle, the i860 XP microprocessor
drives the correct address for the write-back.

NOTES:

A Outstanding cycle (for example, a single-transfer read) finishes during the inquiry
B Earliest inquiry, no invalidation
C Earliest successive inquiry, with invalidation

1. EADS# is not sampled in the clock after its assertion

2. Inquiry B misses cache

3. Earliest deassertion of AHOLD is one clock after last assertion of EADS#

4. Inquiry C hits cache, invalidates line

5. Earliest assertion of ADS# for next cycle is one clock after deassertion of AHOLD

Figure 5.19. Fastest Inquiry Cycles (Miss and Hit)

For all types of inquiry, the write-backs are not
pipelined into an outstanding cycle, except when the
FLINE# pin is used (refer to section 5.3.5). ADS#
for the inquiry write-back is asserted from one to four
clock periods after the HITM# pin is driven active or
after the last BRDY# is returned for any outstanding
cycle, whichever occurs later.

Bursts for a HITM# write-back, as for any write-back,
are in the order 0, 8, 0x10, 0x18, because the
i860 XP microprocessor ignores A4-A3 of the
inquiry address.

Figure 5.20 shows an inquiry cycle that hits an
M-state line.

The fact that a write-back cycle is initiated while
address lines are floating supports multiple inquiries
(with write-backs) during a single AHOLD session.
This is especially useful during secondary cache
replacement processing, when the secondary-cache
line is larger than that of the i860 XP microprocessor.

Note that EADS# is ignored as long as HITM# is 
active. If the system is executing a series of inquir- 
ies, it might happen that the HITM# assertion for 
one inquiry masks the EADS# for a subsequent in- 
quiry. In that case the system must reassert EADS# 
to restart the masked inquiry. 

Inquiries can occur during a hold due to HOLD/ 
HLDA or BOFF#. However, in these cases, the cy- 
cle definition pins and ADS# are floating. If an in- 
quiry requires a write-back, the HOLD or BOFF# 
must be deasserted so that the cycle definition pins 
and ADS# can be driven to start the write-back cy- 
cle. If HITM# is active at the time of ADS#, the first 
ADS# issued after HOLD is deasserted corre- 
sponds to the write-back of the modified line which 
was snooped. 

5.3.2 SNOOPING RESPONSIBILITY LIMITS 

The i860 XP microprocessor takes responsibility for
responding to inquiry cycles for a cache line only
during the time that the line is actually in the cache
or in a write-back buffer. There are times during the
cache line fill cycle and during the cache replacement
cycle when the line is "in transit", and inquiry
(snooping) responsibility must be taken by other
system components.

Systems designers should consider the possibility 
that an inquiry cycle may arrive at the same time as 
a cache line fill or replacement for the same ad- 
dress. This situation can occur: 

• In multiprocessor systems that have external
(secondary) caches with separate CPU and
memory busses, thereby allowing concurrent
activity on the two busses. In such systems, it is
desirable to run invalidation cycles concurrently
with other i860 XP microprocessor bus activity. It
can happen that writes on the memory bus cause
invalidation requests to the i860 XP microprocessor
at the same time that the i860 XP microprocessor
fetches data from the secondary cache.
Such events can occur at any time relative to
each other.

• In multiprocessor systems with no secondary
cache, if memory is dual-ported. In such systems,
two processors can simultaneously read the
same line, each sending an inquiry to the other.


The simultaneous activities considered here may be
for different data items in the same cache line.
Unless the inquiry request is timed carefully with
respect to the cache fill cycle, the cache-consistency
mechanism may be subverted, and data
inconsistencies may result (for example, both CPUs may get
the line in E-state on a read). If the 82495XP and
82490XP cache is being used, the timing with
respect to the i860 XP microprocessor is handled
correctly by the cache controller; however, the same
problem may arise between the memory system and
the secondary cache.



There are two cases to consider: 

1. Inquiry for a line that is being cached.

2. Inquiry for a line that is being replaced. 


5.3.2.1 Inquiry for a Line Being Cached

The i860 XP microprocessor accepts an inquiry
cycle at any time, even if it hits the line being cached at
that time. Regardless of the timing of the cycle, the
i860 XP microprocessor delivers the read data to the
load instruction that initiated the read request.
However, the timing of the invalidation cycle determines
whether the line is placed in the cache and what
value the i860 XP microprocessor drives on HIT#.
Table 5.4 summarizes the different cases.


Table 5.4. Inquiry for a Line being Cached


                        EADS# before        EADS# after
                        or with NA#         NA# or
                        or 1st BRDY#        1st BRDY#

Line is cached?         YES                 NO

HIT# =                  Inactive            Active

Data/Instruction        YES                 YES
used by CPU?
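
Table 5.4 can also be expressed as a small lookup, shown here as a
Python sketch. The function and key names are hypothetical, and the
single boolean input collapses the table's two timing columns into
"EADS# arrived after NA# or the first BRDY#" versus "before or with".

```python
def inquiry_during_line_fill(eads_after_na_or_first_brdy):
    """Return the Table 5.4 outcome for an inquiry that matches the
    line currently being cached."""
    late = bool(eads_after_na_or_first_brdy)
    return {
        "line_cached": not late,    # early inquiry cannot stop the fill
        "hit_active": late,         # HIT# is driven only for late inquiry
        "data_used_by_cpu": True,   # read data is always delivered
    }
```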
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If EADS# is asserted before or with the sampling of 
KEN # , the processor cannot match the address of 
the line being cached with an invalidation request. 
Thus, the processor does not assert H[T#. The ex- 
ternal system must satisfy the inquiry with the cor- 
rect data and WB/WT # status. If invalidation of that 
line is required, the system must do one of the fol- 
lowing: 

• Delay assertion of EADS# until one clock after 
assertion of KEN # . 

• Reassert EADS# after KEN#. 



• Make KEN# Inactive at the first BRDY# or NA#, 
thereby preventing the line from being cached. 

Figures 5.21 and 5.22 show when the i860 XP
microprocessor picks up responsibility for inquiries for a
line that it is caching. Figure 5.21 shows the earliest
EADS# assertion that invalidates the line being
cached relative to the first BRDY# for nonpipelined
cycles. Figure 5.22 shows the earliest EADS#
assertion that invalidates the line being cached relative
to the first NA# for pipelined cycles. These timings
hold for normal and late back-off modes.
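
The two columns of Table 5.4 can be restated as a small lookup. The Python sketch below is illustrative only: the function name, the boolean argument, and the dictionary keys are assumptions introduced here and are not part of the i860 XP documentation.

```python
def inquiry_for_line_being_cached(eads_after_na_or_first_brdy: bool) -> dict:
    """Restate Table 5.4 for an inquiry that hits a line being filled.

    Pass True when EADS# is asserted after NA# or the first BRDY# of the
    line fill, and False when it is asserted before or together with it.
    """
    if eads_after_na_or_first_brdy:
        # The processor recognizes the inquiry: HIT# goes active and the
        # line is not placed in the cache.
        return {"line_cached": False, "hit_active": True, "data_used_by_cpu": True}
    # Too early to match: the line is cached and HIT# stays inactive, so the
    # external system must satisfy the inquiry itself.
    return {"line_cached": True, "hit_active": False, "data_used_by_cpu": True}
```

In both cases the read data is still delivered to the load that requested it; only the caching decision and the HIT# value differ.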


5.3.2.2 Inquiry for a Line Being Replaced

When the i860 XP microprocessor is replacing a line,
there are two cases:

1. If the replacement does not require write-back,
the address being replaced can be matched by
an inquiry until assertion of NA# or first BRDY#
of the line-fill cycle. From that point on, the
inquiry has no effect.

2. If the replacement requires a write-back, the
address being replaced can be matched by an
inquiry until assertion of the last BRDY# for the
write-back. An EADS# as late as two clocks
before the last BRDY# can cause HITM# to be
asserted.


Figures 5.23 through 5.25 show when the i860 XP
microprocessor drops responsibility for recognizing
inquiries for a line that it is writing back. They show
the latest EADS# assertion that can cause HITM#
assertion. In late back-off mode, EADS# can be
asserted later, because BRDY# is internally delayed
(Figures 5.24 and 5.25).

In all these cases, HITM# remains active for only
one clock period. HITM#, as always, remains active
through the last BRDY# of the corresponding
write-back; in these cases the write-back has already
completed.

If an inquiry cycle hits the write-back address after
its ADS# has been issued, the i860 XP
microprocessor asserts HITM#; however, HIT# is
deasserted. This unique combination of values on HIT# and
HITM# indicates that the write-back cycle
corresponding to the HITM# has already been issued.
NOTES:

A  Write-back cycle
S  Snoop (inquiry) cycle
R  Addresses of cycles A and S are the same


Figure 5.23. Latest Snooping of Write-Back (Not Late Back-Off Mode)
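
External logic that watches HIT# and HITM# has to distinguish the special combination described in section 5.3.2.2 from an ordinary modified-line hit. The Python sketch below shows one way to classify the two signals. The function name and the returned strings are assumptions made for illustration; the meanings given for "HIT# only" and "neither signal" follow the usual reading of these pins and are not stated in this section.

```python
def classify_inquiry_response(hit_active: bool, hitm_active: bool) -> str:
    """Interpret the HIT#/HITM# pair sampled after an inquiry cycle."""
    if hitm_active and not hit_active:
        # Special case: the inquiry hit a line whose replacement write-back
        # already has its ADS# on the bus.
        return "write-back already issued"
    if hitm_active and hit_active:
        # Ordinary hit to a modified line: an inquiry write-back will follow.
        return "modified line hit, write-back follows"
    if hit_active:
        # Assumed meaning: the line is present but not modified.
        return "hit, no write-back"
    # Assumed meaning: the line is not in the cache.
    return "miss"
```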
5.3.3 WRITE CYCLE REORDERING DUE TO BUFFERING

The MESI cache protocol and the ability to perform
and respond to inquiry cycles guarantee that writes
to the cache are logically equivalent to writes that go
to memory. In particular, the order of read and write
operations on cached data is the same as if the
operations were on data in memory. Even uncached
memory read and write requests usually occur on
the external bus in the same order that they are
issued in the program. For example, when a write miss
is followed by a read miss, the write data goes onto
the bus before the read request is put on the bus.
However, the posting of writes in write buffers
coupled with inquiry cycles may cause the order of
writes seen on the external bus to differ from the
order they appear in the program. Consider the
following example, which is illustrated in Figure 5.26:

1. Three bus cycles are outstanding.

2. Processor 1 executes a store to address A, which
misses the cache. This store is posted; that is,
the data is latched in the write buffer while the
processor continues execution without waiting for
the store to be completed on the bus. In this case
the store is not even put on the bus because
there are already three outstanding cycles.

3. Processor 1 executes a store to address B, which
hits the cache.

4. Processor 2 executes an inquiry for address B.
Processor 1 looks in its cache, finds the modified
line, asserts HIT# and HITM#, and executes a
write-back cycle to address B, while the data for
address A is still in the write buffer.

5. Processor 1 issues the write to address A on the
bus.

In this example, the original order of the writes has
been changed. In most cases it is not necessary that
the ordering of writes be strictly maintained. But
there are cases (for example, semaphore updates in
a multiprocessor system) that require stores to be
observed externally in the same order as
programmed. There are several ways to ensure
serialization of stores:

1. Bracket one of the stores with the lock and
unlock instructions. That forces serialization of
the stores (refer to section 5.4). In the above
example of a store-miss followed by store-hit,
locking either store would ensure that the internal
store-hit does not update the cache until the miss
gets to the external bus.

2. Apply the write-through policy to the critical data,
by setting WT = 1 in the page table entries or by
driving the WB/WT# pin low.

3. Configure the processor for strong ordering
mode by asserting EWBE# during RESET.


Figure 5.26. Write Reordering due to Buffering

Option 1 is implementable by user-level programs,
while option 2 is an operating-system level solution,
not directly implementable by user-level code.
Option 3, the hardware solution, is discussed in greater
detail in section 5.3.4.
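
The five-step reordering example in this section can be replayed with a toy model. The Python sketch below is not a cycle-accurate description of the bus; the function name, the data structures, and the way the inquiry is triggered are assumptions chosen only to make the change in ordering visible.

```python
from collections import deque


def bus_write_order():
    """Replay steps 1 to 5 and return (program order, bus order) of the writes."""
    program_order = []
    bus_order = []
    write_buffer = deque()   # posted stores waiting for a free bus slot
    modified_lines = set()   # lines that processor 1 holds in M state
    outstanding_cycles = 3   # step 1: three bus cycles are outstanding

    # Step 2: the store to A misses the cache and is posted.
    program_order.append("A")
    if outstanding_cycles >= 3:
        write_buffer.append("A")   # cannot even be put on the bus yet
    else:
        bus_order.append("A")

    # Step 3: the store to B hits the cache and updates it immediately.
    program_order.append("B")
    modified_lines.add("B")

    # Step 4: processor 2 inquires for B. The line is modified, so the
    # inquiry write-back goes onto the bus ahead of the posted store.
    if "B" in modified_lines:
        bus_order.append("B")
        modified_lines.discard("B")

    # Step 5: the posted store to A finally reaches the bus.
    while write_buffer:
        bus_order.append(write_buffer.popleft())

    return program_order, bus_order
```

Calling `bus_write_order()` yields program order A, B but bus order B, A, which is the reordering that the three serialization options are meant to prevent.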


5.3.4 STRONG ORDERING MODE

In strong ordering mode, the processor delays
updates to its internal data cache in either of these
conditions:

1. The internal write buffer is not empty.

2. An external write buffer is not empty (the external
system signals this condition by deactivating the
EWBE# signal).


In strong ordering mode, EWBE# can be deasserted
only between the ADS# and the last BRDY# of
a store. The earliest deassertion is the clock after
ADS#; the latest deassertion is together with the
last BRDY#. EWBE# can be reasserted at any
time, except when the processor is performing an
inquiry write-back. In other words, EWBE# must not
activate while HITM# is active. When EWBE# goes
active, the processor completes any cache update
that may have been delayed by its deassertion.

Figure 5.27 shows how an external cache can use
EWBE# when a store miss in the i860 XP
microprocessor is also a miss in the external cache.

An external cache controller should also refrain from
updating the external cache while EWBE# is active.


By delaying the cache update until all write buffers
are empty, the i860 XP microprocessor avoids the
out-of-order sequence shown in section 5.3.3.
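
The two delay conditions and the EWBE# deassertion window can be written as simple predicates. The Python sketch below is a reading aid under stated assumptions: the function names are invented here, and clocks are modeled as plain integers counted from an arbitrary origin.

```python
def cache_update_allowed(internal_buffer_empty: bool,
                         ewbe_active: bool,
                         strong_ordering: bool = True) -> bool:
    """Return True when a store hit may update the internal data cache."""
    if not strong_ordering:
        return True  # outside strong ordering mode nothing is delayed
    # Both the internal buffer and the external buffers (signaled by an
    # active EWBE#) must be empty.
    return internal_buffer_empty and ewbe_active


def ewbe_deassertion_in_window(ads_clock: int,
                               last_brdy_clock: int,
                               deassert_clock: int) -> bool:
    """Check the deassertion window: the earliest point is the clock after
    ADS#, the latest is the clock of the last BRDY# of the store."""
    return ads_clock + 1 <= deassert_clock <= last_brdy_clock
```

For a store with ADS# in clock 10 and its last BRDY# in clock 14, deassertion in clocks 11 through 14 satisfies the window, while clock 10 or clock 15 does not.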
5.3.5 SCHEDULING INQUIRY WRITE-BACK CYCLES

In order to preserve system-wide ordering of memory
transactions in multiprocessor systems that have
a pipelined or split-transaction memory bus, it may
be necessary to get the data corresponding to an
inquiry hit before outstanding bus cycles are
completed. Another bus master can always request an
inquiry while the i860 XP microprocessor has cycles
outstanding on the bus. However, when AHOLD is
asserted, the i860 XP microprocessor normally
completes outstanding cycles before it performs any
write-back that may be required. The i860 XP
microprocessor provides two methods for causing the
inquiry write-back before outstanding cycles are
completed:

FLINE#  When FLINE# is asserted during the
EADS# of an inquiry that hits an M-state
line, the i860 XP microprocessor issues a
write-back cycle and writes the dirty line to
memory before the outstanding bus cycles
are completed.

BOFF#   If there are outstanding cycles on the bus,
asserting BOFF# clears the bus pipeline.
If an inquiry causes HITM# to be asserted,
then the first cycle issued by the i860 XP
microprocessor after deassertion of
BOFF# is the inquiry write-back cycle.
After the inquiry write-back, it reissues the
aborted cycles.

5.3.5.1 Choosing between FLINE# and BOFF# 

FLINE#, although the more efficient choice, cannot
handle all situations. Under certain circumstances,
outstanding stores on the bus can correspond
to data that is obsolete relative to the data
in the cache, because a subsequent store has
updated the cache after the ADS# for the outstanding
store has occurred. For example:

• An aliasing store hit, in which a cache virtual-tag
miss occurs and the ADS# is issued at the same
time as a physical-tag hit. Then the cached data
would be updated before external memory, and a
subsequent store to the new virtual address
could also update cache before the outstanding
bus store completed.

• Back-to-back writes to the same line can also
update the cache more recently than the bus when
the write-once update policy is employed. The
first write updates the cache and generates a bus
write request, but the second write only updates
the cache.

In both of these examples the outstanding stores on 
the bus are obsolete relative to the data in the cache 
line. If an inquiry cycle hits a line and this line is 
written back out of order (that is, before outstanding 
stores are completed), special care should be taken 
to discard the outstanding stores. 

The easiest way to avoid this situation is not to
assert FLINE# when stores are outstanding, but use
BOFF# instead. If out-of-order write-back is
implemented with BOFF#, the i860 XP microprocessor
does not restart the outstanding store to that line if
such a store has been obsoleted by a later cache hit
store. That is, the i860 XP microprocessor detects
this condition and kills the obsolete data. However,
lock-bracketed stores (including the last store in the
lock sequence) are restarted by the i860 XP
microprocessor, because lock-bracketed stores update
the cache only after BRDY# is returned.
If, on the other hand, out-of-order write-back is
implemented by using only the FLINE# pin, the
external system must return BRDY#s for outstanding
stores, but the data must be ignored if it has already
been written out by an inquiry write-back.

Note that if a replacement write-back is in progress
(ADS# has been issued, but last BRDY# has not
occurred) and an inquiry hits the same line that is
being written back, the FLINE# pin is ignored. The
system can recognize this special case by the fact
that HITM# is asserted while HIT# is deasserted. If
other cycles are outstanding and it is necessary to
write the line back before the other cycles, BOFF#
can be used.
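
The guidance in this section reduces to a short decision rule. The Python sketch below encodes only what the text states: prefer FLINE# for efficiency, fall back to BOFF# when stores are outstanding, and use BOFF# when the inquiry hits a line whose replacement write-back is already in progress (because FLINE# is ignored then). The function name and argument names are hypothetical.

```python
def choose_writeback_method(stores_outstanding: bool,
                            hits_replacement_writeback: bool) -> str:
    """Pick the signal used to force an out-of-order inquiry write-back."""
    if hits_replacement_writeback:
        # FLINE# is ignored in this case (HITM# asserted, HIT# deasserted).
        return "BOFF#"
    if stores_outstanding:
        # Outstanding stores may be obsolete relative to the cache; with
        # BOFF# the processor itself discards obsolete stores.
        return "BOFF#"
    # No hazard: FLINE# is the more efficient choice.
    return "FLINE#"
```

A system that chooses FLINE# even with stores outstanding must instead return BRDY# for those stores and ignore data already written out by the inquiry write-back, as described above.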


5.3.5.2 Reordering Write-Backs with FLINE#

FLINE# must be active during the EADS# that
initiates an inquiry. BRDY# must not be asserted for the
previously issued cycles while HITM# is active. If
HITM# is asserted while the data transfer of the
outstanding cycle is in progress (i.e., first BRDY#
has been asserted, but the entire transfer has not
yet been completed), the i860 XP microprocessor
waits for the current cycle to complete, and only
then issues the write-back. After the last BRDY# for
the ongoing burst (if any), BRDY# is ignored until
the clock period after ADS# is asserted for the
write-back.

From the viewpoint of the i860 XP microprocessor,
an inquiry write-back cycle is just another bus cycle;
so, if there is an outstanding cycle at the time of
FLINE# and HITM# activation, the system must
assert NA# to initiate the write-back.

Figure 5.28 illustrates simple cycle reordering, when 
FLINE# is not asserted during the data transfer of 
another cycle. The outstanding request could be ei- 
ther a read or write. 

Figure 5.29 shows the case in which FLINE# is
asserted after data transfer for the outstanding cycle
has already started. In this case, the i860 XP
microprocessor does not issue a write-back until the
outstanding transfer is completed. NA# is needed in
this example only if other outstanding cycles remain.
5.3.5.3 Reordering Write-Backs with BOFF#

Back-off cycles are discussed in general in Section 
5.2.2. Figure 5.30 shows how BOFF# can be used 
to cancel outstanding cycles so that an inquiry write- 
back can take place immediately. 


5.4 The LOCK# Cycle Attribute 

The processor asserts the LOCK# signal when
several accesses to a single memory location must be
effectively uninterruptible. By causing LOCK# to be
asserted, a programmer can, for example, increment
the contents of a memory variable and be assured
that the variable will not be accessed between the
read and the update of that variable.
NOTES:

A  Outstanding cycle (for example, noncacheable 128-bit read)
W  Write-back cycle

1. AHOLD begins an inquiry while one cycle is outstanding.

2. Earliest assertion of EADS# is two clocks after assertion of AHOLD.

3. Inquiry hits modified line. 

4. Assertion of BOFF# aborts the outstanding cycle. 

5. BRDY# asserted during BOFF# is ignored by CPU. 

6. Write-back begins after deassertion of BOFF#. 

7. Earliest assertion of ADS# for restart of cycle A (assuming no pipelining). 


Figure 5.30. Cycle Reordering via BOFF# (Ongoing Burst) 




The memory location to be locked is the one whose
address is driven during the cycle in which LOCK#
is first activated. In multiprocessor systems, external
hardware should guarantee that no other processor
is granted a locked read, locked write, or unlocked
write to the same location until LOCK# is
deasserted. The i860 XP microprocessor has no hardware
provision to prevent another master from also
locking the variable; this responsibility falls on the bus
arbiter. In the simplest implementation, the arbiter
can globally prevent other masters from accessing
the bus.

Not all cycles affect the value of LOCK#. Code
fetches, write-backs due to replacement or inquiry,
and cycles restarted due to BOFF# do not affect
LOCK#. Any other type of cycle can be used to
initiate or terminate LOCK#, including cache line fills,
interrupt acknowledge, I/O, and special cycles.

Data accesses with LOCK# asserted are not pipe- 
lined, and other data cycles are not pipelined while a 
LOCK# cycle remains outstanding. Instruction 
fetches, however, may be pipelined during lock. 

The i860 XP microprocessor can run very long lock
sequences; therefore, to guarantee reasonable bus
turnover latency in multimaster systems, the i860 XP
microprocessor recognizes bus hold (HOLD),
address hold (AHOLD), and back-off (BOFF#) while
the LOCK# signal is active. In spite of such
intervening conditions, the arbiter should prevent any
other bus master from also locking or updating the
variable the i860 XP microprocessor locked. In
simple systems the HOLD input can be masked by the
LOCK# output (that is, the external logic that
generates HOLD can AND the LOCK# signal with other
hold conditions). More sophisticated systems,
however, may allow the bus to be turned over while
LOCK# is asserted.

Whatever the lock implementation, arbiter design
must, in one case, allow another processor to write
the locked variable. That case is when another
i860 XP microprocessor or master asserts HITM# in
response to the inquiry generated by the locking
processor’s initial read. That other master must write
back the locked variable before the i860 XP
microprocessor can read it. This HITM# write-back must
always be allowed.
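
The arbiter rules for LOCK# can be summarized in two small functions. This Python sketch is a simplified model, not a description of any particular arbiter: the function names, the access-type strings, and the boolean pin model are all assumptions. LOCK# is treated as active low, so a HIGH pin level means that no lock is in progress.

```python
# Access types that must not be granted to another master for the locked
# location while LOCK# is asserted.
BLOCKED_WHILE_LOCKED = {"locked read", "locked write", "unlocked write"}


def hold_input(hold_request: bool, lock_pin_high: bool) -> bool:
    """Simple-system masking: AND the LOCK# pin level with the hold request.

    While LOCK# is asserted (pin LOW) the hold request is suppressed.
    """
    return hold_request and lock_pin_high


def arbiter_grants(access: str, same_location: bool, lock_active: bool,
                   hitm_writeback: bool = False) -> bool:
    """Decide whether another master may perform the given access."""
    if hitm_writeback:
        # The write-back of a modified copy of the locked variable must
        # always be allowed, or the locking read cannot complete.
        return True
    if lock_active and same_location and access in BLOCKED_WHILE_LOCKED:
        return False
    return True
```

The masking function corresponds to the simple implementation; a more sophisticated arbiter would allow bus turnover during lock and rely only on the per-location rule.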

The timing of LOCK# is shown in Figure 5.31. Note
that LOCK# is asserted in the same clock period as
ADS# for the locked address, but is deasserted in
the clock period after ADS# for the unlocking load
or store.


[Timing diagram showing CLK (clocks 1 through 7), LEN, CACHE#, W/R#,
ADS#, ADDRESS, BRDY#, DATA, and LOCK#]

NOTES:

L  Locking access
U  Unlocking access

1. This address is to be locked.

2. LOCK# is asserted with ADS#.

3. LOCK# is deasserted one clock after ADS#.


Figure 5.31. LOCK# Timing
5.5 RESET Initialization

Initialization of the i860 XP microprocessor is caused
when the system asserts the RESET signal for at
least ten clocks. Table 5.5 shows the status of
output pins during the time that RESET is asserted.
Note that the bidirectional data pins (D63-D0 and
DP7-DP0) are floated during RESET, though the
bidirectional A31-A3 pins are not. If the i860 XP
microprocessor is used with 82495XP and 82490XP
cache, however, the latter do float the bidirectional
pins they share with the i860 XP microprocessor during
RESET. Note that HOLD requests are honored
during RESET and that the HLDA output signal may
also become active. The status of output pins
depends on whether a HOLD request is being
acknowledged. Note also that the test logic may be active
during RESET and that the EXTEST instruction may
drive other values on the output pins.

After the RESET signal goes inactive the processor
remains in the RESET state for three more clocks.
Applications that use the HOLD signal to float the
bus during RESET should keep HOLD active for
three more clocks after the RESET signal is
deactivated.

Some aspects of processor configuration are
determined by asserting input signals during RESET. To
select a given option, the corresponding input must
be asserted for at least the last three clocks before
the falling edge of RESET; to deselect, the
corresponding input must be deasserted for at least the
last three clocks before the falling edge of RESET:

EWBE#    Enter strong ordering mode.

FLINE#   Enter one clock late back-off mode.

INT/CS8  Enter eight-bit code-size mode.

PEN#     Enter normal (small output buffers) current mode.

Figure 5.32 shows how configuration pins are
sampled during the three clock periods just before the
falling edge of RESET. No inputs besides EWBE#,
HOLD, FLINE#, INT/CS8, and PEN# are sampled
during RESET.
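
The three-clock rule for configuration inputs can be checked mechanically. In the Python sketch below, a pin is described by a list of booleans, one per clock, ending with the clock just before the falling edge of RESET; True means the pin is asserted in that clock, whatever its electrical polarity. The function name and the use of `None` for a sequence that meets neither requirement are assumptions of this sketch; the text does not say what the processor does in that case.

```python
from typing import List, Optional


def reset_option_selected(asserted_by_clock: List[bool]) -> Optional[bool]:
    """Evaluate one configuration pin against the three-clock rule.

    Returns True if the option is selected, False if it is deselected,
    and None if the pin was neither held asserted nor held deasserted
    for the last three clocks before the falling edge of RESET.
    """
    last_three = asserted_by_clock[-3:]
    if len(last_three) < 3:
        return None  # fewer than three clocks observed
    if all(last_three):
        return True
    if not any(last_three):
        return False
    return None
```

For example, an EWBE# history of `[False, True, True, True]` selects strong ordering mode, while `[True, True, False, True]` satisfies neither requirement.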


Table 5.5. Output Pin Status during Reset

                                                     Pin Value
Pin Name                                  HOLD Not           HOLD
                                          Acknowledged       Acknowledged
BREQ                                      LOW                LOW
HLDA                                      LOW                HIGH
W/R#, PWT, PCD                            LOW                Tristate OFF
ADS#                                      HIGH               Tristate OFF
D63-D0, DP7-DP0                           Tristate OFF       Tristate OFF
A31-A3, BE7#-BE0#, NENE#, CACHE#,         Undefined          Tristate OFF
CTYP, D/C#, KB0, KB1, LEN, M/IO#, PCYC
PCHK#, HIT#                               Undefined          Undefined
HITM#, LOCK#                              HIGH               HIGH

NOTE:

This table does not apply if the test logic is running the EXTEST instruction.
While in eight-bit code-size mode, instruction cache
misses are one-byte reads (transferred on D7-D0 of 
the data bus) instead of eight-byte reads. This allows 
the i860 XP microprocessor to be bootstrapped from 
an eight-bit ROM. For these code reads, byte en- 
ables BE2#-BE0# are redefined to be the low or- 
der three bits of the address, so that a complete 
byte address is available. The entire eight-byte data 
bus continues to be parity-checked by the i860 XP 
microprocessor during CS8-mode instruction fetch- 
es; therefore, external hardware must either gener- 
ate good parity on all eight bytes or disable parity 
traps by deasserting PEN# during CS8 mode. 
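
Because BE2#-BE0# carry the low-order address bits during CS8-mode code reads, a ROM decoder can rebuild the full byte address from the pins. The Python sketch below assumes that the three pin levels are taken directly as address bits 2 through 0, with no inversion, which is how the paragraph above reads; the function name and argument names are hypothetical.

```python
def cs8_byte_address(a31_a3: int, be2_be0: int) -> int:
    """Combine A31-A3 with BE2#-BE0# into a 32-bit byte address.

    a31_a3  : value on address pins A31-A3 (29 bits)
    be2_be0 : value on BE2#-BE0# during a CS8-mode code read (3 bits)
    """
    if not 0 <= a31_a3 < (1 << 29):
        raise ValueError("A31-A3 must fit in 29 bits")
    if not 0 <= be2_be0 < 8:
        raise ValueError("BE2#-BE0# must fit in 3 bits")
    # A31-A3 supply the upper 29 bits, BE2#-BE0# the lower 3 bits.
    return (a31_a3 << 3) | be2_be0
```

With all address pins high and BE2#-BE0# equal to binary 101, the result is 0xFFFFFFFD.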

While in this mode, instructions must reside in an 
eight-bit wide memory, while data must reside in a 
separate 64-bit wide memory. After the code has 
been loaded into 64-bit memory, initialization code 
can initiate 64-bit code fetches by clearing the CS8 
bit of the dirbase register (refer to section 2). Once 
eight-bit code-size mode is disabled by software, it
cannot be reenabled except by resetting the i860 XP 
microprocessor. 

Instruction fetches in CS8 mode update the instruc- 
tion cache if KEN# is asserted during NA# or all of 
the first eight BRDY#s (refer to section 4.2.26). 
They are pipelined if NA# is asserted. When used
with the 82495XP and 82490XP cache, CS8 mode
works only if the ROM locations are made
noncacheable.


6.0 TESTABILITY 

The i860 XP microprocessor provides testability fea- 
tures compatible with the proposed Standard Test 
Access Port and Boundary-Scan Architecture (IEEE 
Std. P1149.1/D6). The subset of the standard test 
logic implemented in the i860 XP microprocessor 
provides for testing the interconnections between 
the i860 XP microprocessor and other integrated cir- 
cuits once they have been assembled onto a printed 
circuit board. 

The test logic consists of a boundary-scan register 
and other building blocks that are accessed through 
a test access port (TAP). The TAP provides a simple 
serial interface that makes it possible to test all sig- 
nal traces with only a few probes. 

The TAP can be controlled by a bus master. The bus 
master can be either automatic test equipment or a 
component that interfaces to a four-pin test bus. 


6.1 Test Architecture 


The test logic contains the following elements: 

• Test access port (TAP), which consists of input
pins TMS, TCK, TDI, and TRST#; and output pin
TDO.

• TAP controller, which receives the dedicated test
clock (TCK) and interprets the signals on the test
mode select (TMS) line. The TAP controller
generates clock and control signals for the
instruction and test data registers and for other parts of
the test logic.

• Instruction register (IR), which allows instruction
codes to be shifted into the test logic. The
instruction codes are used to select the test to be
performed or the test data register to be
accessed.

• Test data registers: Bypass Register (BPR),
Device Identification Register (DID), and
Boundary-Scan Register (BSR).



The instruction and test data registers are separate 
shift-register paths connected in parallel and having 
a common serial data input and a common serial 
data output connected to the TAP TDI and TDO sig- 
nals respectively. 


6.2 Test Data Registers 

The test logic contains the following data registers: 

• Bypass Register (BPR): BPR is a one-bit shift
register that provides a minimum-length path
between TDI and TDO when no test operation of
the component is required. This allows more
rapid movement of test data to and from other board
components that are required to perform test
operations. While running through BPR, the data is
transferred without inversion from TDI to TDO.

• Device Identification Register (DID): This
register contains the manufacturer’s identification
code, part number code, and version code in the
format shown by Figure 6.1. The values are:
manufacturer’s identification code (9), part number
code (61A0), version code (8), entire 32-bit value
(0x861A0013).

• Boundary Scan Register (BSR): The BSR is a
single shift-register path containing 150 cells that
are connected to all input and output pins of the
i860 XP microprocessor. Figure 6.2 shows the
logical structure of the BSR. Input cells only
capture data; they do not affect operation of the
i860 XP microprocessor. Data is transferred
without inversion from TDI to TDO through the BSR
during scanning. The BSR can be operated by
the EXTEST and SAMPLE instructions.
Figure 6.1. Format of DID Register



Figure 6.2. Logical Structure of BSR Register 
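
The DID value quoted in section 6.2 can be split back into its fields as a cross-check. Figure 6.1 is not reproduced here, so the Python sketch below assumes the usual boundary-scan device identification layout: bit 0 fixed at 1, an 11-bit manufacturer code in bits 11-1, a 16-bit part number in bits 27-12, and a 4-bit version in bits 31-28. The function name and the dictionary keys are illustrative.

```python
def decode_did(value: int) -> dict:
    """Split a 32-bit device identification value into its fields."""
    return {
        "version": (value >> 28) & 0xF,          # bits 31-28
        "part_number": (value >> 12) & 0xFFFF,   # bits 27-12
        "manufacturer": (value >> 1) & 0x7FF,    # bits 11-1
        "lsb": value & 0x1,                      # bit 0
    }


fields = decode_did(0x861A0013)
```

Under that layout, 0x861A0013 decodes to version 8, part number 0x61A0, and manufacturer code 9, which agrees with the values listed for the DID register.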


6.3 Instruction Register

The Instruction Register (IR) selects the test to be
performed and the test data register to be accessed.
It is four bits wide, with no parity bit. Table 6.1 shows
the encoding of the instructions supported by the
TAP controller of the i860 XP microprocessor. The
rightmost bit is the least significant and is the first
shifted out on TDO.

Table 6.1. TAP Instruction Encoding

* CAUTION: Operation of these private instructions may
cause damage to the component.

EXTEST  The BSR cells associated with output pins
drive the output pins of the i860 XP
microprocessor. Values scanned into the BSR
cells become the output values. The BSR
cells associated with input pins sample
the inputs of the i860 XP microprocessor.
Note that I/O pins can be input or output
for this test, depending on their control
setting. The values shifted to the input
latches are not used by the internal logic
of the i860 XP microprocessor. After use
of the EXTEST command, the i860 XP
microprocessor must be reset (with the
RESET signal) before normal use.

SAMPLE  The BSR cells associated with output pins
sample the value driven by the i860 XP
microprocessor. BSR cells associated
with input pins sample on the rising edge
of TCK the values driven to the i860 XP
microprocessor. BSR cells associated
with I/O pins sample the value on the
respective pin. The I/O pin can be driven by
the i860 XP microprocessor or by external
hardware. The values shifted to the input
latches are not used by the internal logic
of the i860 XP microprocessor.

IDCODE The identification code of the i860 XP mi- 
croprocessor from the DID register is 
passed to TDO. The DID register is not 
altered by data shifted in on TDI. 

BYPASS Test data is passed from TDI to TDO via 
the single-bit BPR, effectively bypassing 
the test logic of the i860 XP microproces- 
sor. Because of its special encoding, this 
instruction can be entered by holding TDI 
HIGH while completing an instruction- 
scan cycle. This reduces the demands on 
the host test system in cases where ac- 
cess is required, for example, only to chip 
57 on a 100-chip board.

Note that an open circuit fault in the 
board-level test data path causes the 
BPR register to be selected following an 
instruction-scan cycle, because the TDI 
input has a pull-up resistor. Therefore, no 
unwanted interference with the operation 
of the on-chip system logic can occur. 
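
The benefit of BYPASS on a long scan chain is easy to quantify. The Python sketch below counts the bits in one data-register scan when some chips are in BYPASS and the rest expose a boundary-scan register. It assumes, purely for illustration, that every chip on the hypothetical board has a 150-cell BSR like the i860 XP; the function name and constants are introduced here.

```python
BSR_CELLS = 150   # boundary-scan cells per chip (i860 XP value, assumed for all chips)
BPR_BITS = 1      # the bypass register is a single bit


def data_scan_length(total_chips: int, chips_under_test: int) -> int:
    """Number of bits shifted per data scan through the whole chain."""
    if not 0 <= chips_under_test <= total_chips:
        raise ValueError("chips_under_test must be between 0 and total_chips")
    bypassed = total_chips - chips_under_test
    return bypassed * BPR_BITS + chips_under_test * BSR_CELLS
```

For the 100-chip example with only one chip under test, the scan is 99 + 150 = 249 bits long instead of the 15,000 bits needed if every chip were scanned through its BSR.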

Table 6.2 defines which registers are active during
execution of each instruction.


6.4 TAP Controller

The TAP Controller is a synchronous, finite state
machine. It controls the sequence of operations of
the test logic. The TAP Controller changes state
only in response to the following events:

1. A rising edge of TCK.

2. A transition to logic zero at the TRST# input.

3. Power-up.

The value of the TMS input signal at a rising edge of
TCK controls the sequence of state changes. The
state diagram for the TAP controller is shown in
Figure 6.3. Test designers must consider the operation
of the state machine in order to design the correct
sequence of values to drive on TMS.


6.4.1 TEST-LOGIC-RESET STATE

In this state, the test logic is disabled so that normal
operation of the i860 XP microprocessor can
continue unhindered. This is achieved by initializing the
instruction register such that the IDCODE instruction
is loaded. No matter what the original state of the
controller, the controller enters Test-Logic-Reset
when the TMS input is held HIGH for at least five
rising edges of TCK. The controller remains in this
state while TMS is HIGH.

If the controller leaves the Test-Logic-Reset state as
a result of an erroneous LOW signal on the TMS line
at the time of a rising edge of TCK (for example, a
glitch due to external interference), it returns to the
Test-Logic-Reset state following three rising edges
of TCK with the TMS signal at the intended HIGH
logic level. The operation of the test logic is such
that no disturbance is caused to on-chip system
logic operation as the result of such an error. On
leaving the Test-Logic-Reset state, the controller moves
into the Run-Test/Idle state, where no action occurs
because the current instruction has been set to
select operation of the DID register. The test logic is
also inactive in the Select-DR-Scan and
Select-IR-Scan states.

The TAP controller is also forced to the
Test-Logic-Reset state by applying a LOW logic level to the
TRST# input and at power-up.
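
The two reset claims in section 6.4.1 (five TMS-HIGH clocks from any state, three after a single glitch) can be checked against a table-driven model of the controller. Figure 6.3 is not reproduced here, so the Python sketch below uses the standard 16-state boundary-scan TAP transition table; the transitions for the Exit1-IR, Pause-IR, Exit2-IR, and Update-IR states are taken from that standard rather than from the surrounding text, and the names `TAP_NEXT` and `run_tap` are introduced for illustration.

```python
# state -> (next state when TMS = 0, next state when TMS = 1),
# evaluated on each rising edge of TCK.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def run_tap(state: str, tms_values) -> str:
    """Apply one TMS value per rising edge of TCK and return the final state."""
    for tms in tms_values:
        state = TAP_NEXT[state][tms]
    return state


# Five TMS-HIGH edges reach Test-Logic-Reset from every state.
all_reset = all(run_tap(s, [1] * 5) == "Test-Logic-Reset" for s in TAP_NEXT)

# A single LOW glitch followed by three HIGH edges returns to reset.
after_glitch = run_tap("Test-Logic-Reset", [0, 1, 1, 1])
```

The same table can be used to plan TMS sequences, for example `run_tap("Run-Test/Idle", [1, 0, 0])` to reach Shift-DR.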


Table 6.2. Registers Active by Instruction

                               Register
Mode       BSR                  DID           BPR
EXTEST     TDI -> BSR -> TDO    Inactive      Inactive
SAMPLE     TDI -> BSR -> TDO    Inactive      Inactive
IDCODE     Inactive             DID -> TDO    Inactive
BYPASS     Inactive             Inactive      TDI -> BPR -> TDO
NOTE:

0,1 The values present on TMS at the time of a rising edge on TCK. 


Figure 6.3. TAP Controller State Diagram 


6.4.2 RUN-TEST/IDLE STATE 

The controller enters this state between scan opera- 
tions. Once in this state, the controller remains in 
this state as long as TMS is held LOW. No activity 
occurs in the test logic. The instruction register and 
all test data registers retain their previous state. 
When TMS is HIGH and a rising edge is applied to 
TCK, the controller moves to the Select-DR-Scan 
state. 


6.4.3 SELECT-DR-SCAN STATE 

This is a temporary controller state. The test data
register selected by the current instruction retains its
previous state. If TMS is held LOW and a rising edge
is applied to TCK when in this state, the controller
moves into the Capture-DR state, and a scan
sequence for the selected test data register is initiated.
If TMS is held HIGH and a rising edge is applied to
TCK, the controller moves to the Select-IR-Scan
state.

The instruction does not change in this state.


6.4.4 SELECT-IR-SCAN STATE

This is a temporary controller state. The test data
register selected by the current instruction retains its
previous state. If TMS is held LOW and a rising edge
is applied to TCK when in this state, the controller
moves into the Capture-IR state, and a scan
sequence for the instruction register is initiated. If TMS
is held HIGH and a rising edge is applied to TCK, the
controller moves to the Test-Logic-Reset state.

The instruction does not change in this state.


6.4.5 CAPTURE-DR STATE

In this state, the BSR captures input pin data if the
current instruction is EXTEST or SAMPLE. The other
test data registers, which do not have parallel input,
are not changed.

The instruction does not change in this state.

When the TAP controller is in this state and a rising
edge is applied to TCK, the controller enters the
Exit1-DR state if TMS is HIGH or the Shift-DR state
if TMS is LOW.


6.4.6 SHIFT-DR STATE

In this controller state, the test data register
connected between TDI and TDO as a result of the
current instruction shifts data one stage toward its serial
output on each rising edge of TCK.

The instruction does not change in this state.

When the TAP controller is in this state and a rising
edge is applied to TCK, the controller enters the
Exit1-DR state if TMS is HIGH or remains in the
Shift-DR state if TMS is LOW.


6.4.7 EXIT1-DR STATE

This is a temporary state. If TMS is held HIGH, a
rising edge applied to TCK while in this state causes
the controller to enter the Update-DR state, which
terminates the scanning process. If TMS is held LOW
and a rising edge is applied to TCK, the controller
enters the Pause-DR state.

The test data register selected by the current
instruction retains its previous state unchanged. The
instruction does not change in this state.


6.4.8 PAUSE-DR STATE

The pause state allows the test controller to
temporarily halt the shifting of data through the test data
register in the serial path between TDI and TDO.
This might be necessary, for example, to allow the
tester to reload its pin memory from disk during
application of a long test sequence.

The test data register selected by the current
instruction retains its previous state. The instruction
does not change in this state.

The controller remains in this state as long as TMS
is LOW. When TMS goes HIGH and a rising edge is
applied to TCK, the controller moves to the Exit2-DR
state.


6.4.9 EXIT2-DR STATE

This is a temporary state. If TMS is held HIGH and a
rising edge is applied to TCK, the scanning process
terminates, and the TAP controller enters the
Update-DR state. If TMS is held LOW and a rising
edge is applied to TCK, the controller enters the
Shift-DR state.

The test data register selected by the current
instruction retains its previous state unchanged. The
instruction does not change in this state.

6.4.10 UPDATE-DR STATE 

The BSR register is provided with a latched parallel
output to prevent changes at the parallel output
while data is shifted in response to the EXTEST and
SAMPLE instructions. When the TAP controller is in
this state and the BSR register is selected, data is
latched onto the parallel output of this register from
the shift-register path on the falling edge of TCK.
The data held at the latched parallel output does not
change other than in this state.

All shift-register stages in test data registers selected
by the current instruction retain their previous state
unchanged. The instruction does not change in this state.

When the TAP controller is in this state and a rising
edge is applied to TCK, the controller enters the
Select-DR-Scan state if TMS is held HIGH or the
Run-Test/Idle state if TMS is held LOW.


6.4.11 CAPTURE-IR STATE 

In this controller state the shift register contained in 
the instruction register loads the fixed value 0001 on 
the rising edge of TCK. 
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The test data register selected by the current in- 
struction retains its previous state. The instruction 
does not change in this state. 

When the controller is in this state and a rising edge
is applied to TCK, the controller enters the Exit1-IR
state if TMS is held HIGH or the Shift-IR state if TMS
is held LOW.


6.4.12 SHIFT-IR STATE 

In this state, the shift register contained in the
instruction register is connected between TDI and
TDO and shifts data one stage towards its serial output
on each rising edge of TCK.

The test data register selected by the current instruction
retains its previous state. The instruction does not change
in this state.

When the controller is in this state and a rising edge
is applied to TCK, the controller enters the Exit1-IR
state if TMS is held HIGH or remains in the Shift-IR
state if TMS is held LOW.


6.4.13 EXIT1-IR STATE 

This is a temporary state. If TMS is held HIGH, a
rising edge applied to TCK while in this state causes
the controller to enter the Update-IR state, which
terminates the scanning process. If TMS is held LOW
and a rising edge is applied to TCK, the controller
enters the Pause-IR state.

The test data register selected by the current instruction
retains its previous state unchanged. The instruction does
not change in this state, and the instruction register
retains its state.

6.4.14 PAUSE-IR STATE 

This state allows the shifting of the instruction regis- 
ter to be temporarily halted. 

The test data register selected by the current instruction
retains its previous state. The instruction does not change
in this state, and the instruction register retains its
state.

The controller remains in this state as long as TMS
is LOW. When TMS goes HIGH and a rising edge is
applied to TCK, the controller moves to the Exit2-IR
state.


6.4.15 EXIT2-IR STATE 

This is a temporary state. If TMS is held HIGH and a
rising edge is applied to TCK, the scanning process
terminates, and the TAP controller enters the
Update-IR state. If TMS is held LOW and a rising
edge is applied to TCK, the controller enters the
Shift-IR state.

The test data register selected by the current instruction
retains its previous state unchanged. The instruction does
not change in this state, and the instruction register
retains its state.

6.4.16 UPDATE-IR STATE 

The instruction shifted into the instruction register is
latched onto the parallel output from the shift-register
path on the falling edge of TCK. Once the new
instruction has been latched, it becomes the current
instruction.

Test data registers selected by the current instruc- 
tion retain the previous state. 
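
The capture, shift, and update behavior of the instruction register described in sections 6.4.11 through 6.4.16 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the processor's implementation: the 4-bit width is inferred from the fixed capture value 0001, the LSB-first shift order follows common IEEE 1149.1 practice, and the class name, function name, and reset opcode are hypothetical placeholders.

```python
class InstructionRegister:
    """Toy model of a scan register with a latched parallel output."""

    def __init__(self, width=4, reset_instruction=0b1111):
        # width and reset_instruction are assumptions made for illustration.
        self.width = width
        self.shift = 0                    # shift-register stage contents
        self.current = reset_instruction  # latched (current) instruction

    def capture(self):
        # Capture-IR: the shift register loads the fixed value 0001.
        self.shift = 0b0001

    def shift_bit(self, tdi):
        # Shift-IR: move one stage toward the serial output per TCK rising edge.
        tdo = self.shift & 1
        self.shift = (self.shift >> 1) | ((tdi & 1) << (self.width - 1))
        return tdo

    def update(self):
        # Update-IR: the shifted-in value becomes the current instruction.
        self.current = self.shift


def scan_instruction(ir, opcode):
    """Shift `opcode` in LSB first and return the bits observed on TDO."""
    ir.capture()
    observed = 0
    for i in range(ir.width):
        observed |= ir.shift_bit((opcode >> i) & 1) << i
    ir.update()
    return observed
```

In this model the current instruction changes only in `update()`, which mirrors the statement that the instruction does not change in the capture, shift, exit, and pause states. The value returned by `scan_instruction` is the captured pattern 0001, which a test program can use as a simple check of scan-chain integrity.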


6.5 Boundary Scan Register Cell 
Ordering 

Figure 6.4 shows the order of cells in the BSR.
There are 150 cells including TDO. TDI is not a BSR
cell.

The DCTL, ACTL, TCTL, and OCTL cells do not correspond
to pins of the i860 XP microprocessor; rather, they
control the bidirectional and tristate pins:

DCTL   D63-D0, DP7-DP0

ACTL   A31-A3

TCTL   Tristate outputs: ADS#, BE7#-BE0#,
       CACHE#, CTYP, D/C#, KB0, KB1, LEN,
       M/IO#, NENE#, PCD, PCYC, PWT, W/R#

OCTL   Outputs not floated in normal operation:
       BREQ, HIT#, HITM#, HLDA, LOCK#,
       PCHK#

If a value of one is loaded into any of these control 
latches, the associated pins will not drive the exter- 
nal bus while running EXTEST. 

The values of DCTL, ACTL, TCTL, and OCTL are 
undefined during the SAMPLE instruction. 

The values and direction of I/O and outputs do not 
change during the scanning process (that is, during 
Shift-DR states). They only change after scanning is 
completed (in the Update-DR state). 

The decision table, Table 6.3, defines how the
boundary scan instructions EXTEST and SAMPLE/PRELOAD
use the BSR.
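
As a reading aid, the decision table can be expressed as a small lookup function. The sketch below only restates the cell behavior listed in Table 6.3; the function name, argument names, and returned strings are hypothetical and chosen for illustration.

```python
def bsr_cell_action(instruction, cell_kind, control_high=False):
    """Return what a BSR cell does, following the rows of Table 6.3.

    instruction  -- "EXTEST" or "SAMPLE/PRELOAD"
    cell_kind    -- "input", "output", or "io"
    control_high -- value of the associated control cell (used for "io" only)
    """
    if instruction not in ("EXTEST", "SAMPLE/PRELOAD"):
        raise ValueError("unknown instruction: %r" % (instruction,))
    if cell_kind == "io":
        # Input/output cells: control cell LOW -> output, HIGH -> input.
        cell_kind = "input" if control_high else "output"
    if cell_kind == "input":
        return "sample value driven to processor by system"
    if cell_kind == "output":
        if instruction == "EXTEST":
            return "drive output pin with cell value"
        return "sample value driven by processor"
    raise ValueError("unknown cell kind: %r" % (cell_kind,))
```

For example, `bsr_cell_action("EXTEST", "io", control_high=True)` reports that the cell samples the pin, which agrees with the statement above that a control latch loaded with one keeps the associated pins from driving the external bus during EXTEST.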
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Figure 6.4. Boundary Scan Register Ordering 
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6.6 TAP Controller Initialization 

TAP can be initialized by applying a high signal level
on the TMS input for five periods of TCK or by activating
the TRST# input pin. TCK does not have to be running in
order to initialize TAP with the TRST# pin. TRST# is
provided with an internal pull-up resistor; so, even if
an open circuit fault occurs, the TAP logic can still be
used.
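
The five-clock initialization rule follows from the structure of the TAP state machine. The sketch below encodes the standard IEEE 1149.1 state diagram (the states not described in this section, such as Test-Logic-Reset and Select-IR-Scan, are taken from that standard, not from this data sheet) and counts how many TCK rising edges with TMS held HIGH are needed to reach Test-Logic-Reset from each state. The names `TAP_NEXT`, `step`, and `clocks_to_reset` are illustrative.

```python
# Next state for each TAP state, indexed by the TMS value (0 = LOW, 1 = HIGH)
# sampled on the rising edge of TCK.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state, tms):
    """Advance the controller by one TCK rising edge."""
    return TAP_NEXT[state][1 if tms else 0]


def clocks_to_reset(state):
    """Count TCK edges with TMS HIGH needed to reach Test-Logic-Reset."""
    count = 0
    while state != "Test-Logic-Reset":
        state = step(state, 1)
        count += 1
    return count


worst_case = max(clocks_to_reset(s) for s in TAP_NEXT)
```

The worst case occurs from states such as Shift-DR or Pause-IR, where the path runs through Exit, Update, Select-DR-Scan, and Select-IR-Scan before reaching Test-Logic-Reset.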


7.0 MECHANICAL DATA 

Figures 7.1 and 7.2 show the locations of pins; Ta- 
bles 7.1 and 7.2 help to locate pin identifiers. 


Table 6.3. Instruction Functions

Instruction:             EXTEST                         SAMPLE/PRELOAD
Control Cell:            LOW            HIGH            LOW            HIGH

Input BSR cells ...      ... sample values driven to    ... sample values driven to
                         processor by system            processor by system

Values of input cells    NO                             NO
used by processor?

Output BSR cells ...     ... drive output pins with     ... sample values driven
                         cell values                    by processor

Input/output BSR cells:  Treat as       Treat as        Treat as       Treat as
                         output         input           output         input

Figure 7.1. i860 XP Microprocessor Pin Configuration (View from Pin Side)

Figure 7.2. i860 XP Microprocessor Pin Configuration (View from Top Side)

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Table 7.1. Pin Cross Reference by Location 

Location  Signal      Location  Signal      Location  Signal

C15       D12         G18       Vcc         N01       Vcc
C16       D8          G19       D29         N02       Vcc
C17       D7          H01       Vcc         N03       Vss
C18       D16         H02       Vss         N04       ADS#
C19       D18         H03       Vss         N05       HITM#
D01       A11         H04       A16         N15       DP7
D02       Vss         H05       A20         N16       D50
D03       Vcc         H15       D32         N17       Vss
D04       A29         H16       PCHK#       N18       Vcc
D05       BE1#        H17       Vss         N19       D36
D06       BE2#        H18       Vss         P01       VccCLK
D07       BE6#        H19       Vcc         P02       Vcc
D08       EWBE#       J01       Vcc         P03       Vss
D09       D1          J02       Vcc         P04       RSRVD
D10       D5          J03       Vss         P05       CTYP
D11       D10         J04       A12         P15       D59
D12       D14         J05       A14         P16       DP6
D13       DP2         J15       DP5         P17       Vss
D14       D17         J16       D38         P18       Vcc
D15       D19         J17       Vss         P19       D34
D16       D20         J18       Vcc         Q01       TCK
D17       Vcc         J19       Vcc         Q02       Vss
D18       DP3         K01       Vcc         Q03       Vcc
D19       D22         K02       Vss         Q04       CACHE#
E01       A9          K03       Vss         Q05       AHOLD
E02       Vss         K04       A10         Q15       D61
E03       Vcc         K05       A8          Q16       D54
E04       A27         K15       D45         Q17       Vcc
E05       BE0#        K16       D43         Q18       Vss
E15       D21         K17       Vss         Q19       DP4
E16       D23         K18       Vss         R01       A4
E17       D25         K19       Vcc         R02       Vss
E18       Vss         L01       Vcc         R03       Vcc
E19       Vcc         L02       Vcc         R04       BOFF#
F01       A7          L03       Vss         R05       D/C#
F02       Vcc         L04       SPARE       R06       PCD
F03       Vss         L05       A6          R07       INV
F04       A28         L15       D48         R08       PEN#
F05       A30         L16       D41         R09       BREQ
F15       D24         L17       Vss         R10       TDO
F16       D26         L18       Vcc         R11       KB0
F17       Vss         L19       Vcc         R12       HOLD
F18       Vcc         M01       Vcc         R13       TMS
F19       D27         M02       Vss         R14       D63
G01       Vcc         M03       Vss         R15       D60
G02       Vcc         M04       CLK         R16       D57
G03       Vss         M05       A5          R17       Vcc
G04       A22         M15       D53         R18       D33
G05       A26         M16       D47         R19       D35
G15       D28         M17       Vss         S01       A3
G16       D30         M18       Vss         S02       RESET
G17       Vss         M19       D31         S03       LOCK#
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Table 7.1. Pin Cross Reference by Location (Continued) 


Location  Signal      Location  Signal      Location  Signal      Location  Signal

S04       M/IO#       S18       D52         T13       Vss         U08       Vcc
S05       EADS#       S19       D37         T14       Vss         U09       Vcc
S06       INT/CS8     T01       W/R#        T15       Vss         U10       Vcc
S07       BERR        T02       LEN         T16       D56         U11       Vcc
S08       FLINE#      T03       PWT         T17       D49         U12       Vcc
S09       HLDA        T04       PCYC        T18       D42         U13       Vcc
S10       KB1         T05       Vss         T19       D39         U14       Vcc
S11       NENE#       T06       Vss         U01       BRDY#       U15       Vcc
S12       HIT#        T07       Vss         U02       KEN#        U16       D55
S13       TRST#       T08       Vss         U03       NA#         U17       D51
S14       TDI         T09       Vss         U04       WB/WT#      U18       D44
S15       D62         T10       Vss         U05       Vcc         U19       D40
S16       D58         T11       Vss         U06       Vcc
S17       D46         T12       Vss         U07       Vcc




Table 7.2. Pin Cross Reference by Pin Name 


Signal    Location    Signal    Location    Signal    Location    Signal    Location

A3        S01         AHOLD     Q05         D13       B18         D43       K16
A4        R01         BE0#      E05         D14       D12         D44       U18
A5        M05         BE1#      D05         D15       B19         D45       K15
A6        L05         BE2#      D06         D16       C18         D46       S17
A7        F01         BE3#      B04         D17       D14         D47       M16
A8        K05         BE4#      C05         D18       C19         D48       L15
A9        E01         BE5#      A04         D19       D15         D49       T17
A10       K04         BE6#      D07         D20       D16         D50       N16
A11       D01         BE7#      A05         D21       E15         D51       U17
A12       J04         BERR      S07         D22       D19         D52       S18
A13       C01         BOFF#     R04         D23       E16         D53       M15
A14       J05         RSRVD     P04         D24       F15         D54       Q16
A15       B01         BRDY#     U01         D25       E17         D55       U16
A16       H04         BREQ      R09         D26       F16         D56       T16
A17       A01         CACHE#    Q04         D27       F19         D57       R16
A18       C03         CLK       M04         D28       G15         D58       S16
A19       C02         CTYP      P05         D29       G19         D59       P15
A20       H05         D0        A07         D30       G16         D60       R15
A21       B02         D1        D09         D31       M19         D61       Q15
A22       G04         D2        A13         D32       H15         D62       S15
A23       A02         D3        A16         D33       R18         D63       R14
A24       B03         D4        A17         D34       P19         D/C#      R05
A25       A03         D5        D10         D35       R19         DP0       A15
A26       G05         D6        A18         D36       N19         DP1       A19
A27       E04         D7        C17         D37       S19         DP2       D13
A28       F04         D8        C16         D38       J16         DP3       D18
A29       D04         D9        B16         D39       T19         DP4       Q19
A30       F05         D10       D11         D40       U19         DP5       J15
A31       C04         D11       B17         D41       L16         DP6       P16
ADS#      N04         D12       C15         D42       T18         DP7       N15
N1S 
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Table 7.2. Pin Cross Reference by Pin Name (Continued) 


Signal    Location    Signal    Location    Signal    Location    Signal    Location

EADS#     S05         Vcc       B06         Vcc       R17         Vss       H17
FLINE#    S08         Vcc       B07         Vcc       U05         Vss       H18
HIT#      S12         Vcc       B09         Vcc       U06         Vss       J03
HITM#     N05         Vcc       B11         Vcc       U07         Vss       J17
HLDA      S09         Vcc       B13         Vcc       U08         Vss       K02
HOLD      R12         Vcc       B14         Vcc       U09         Vss       K03
INT/CS8   S06         Vcc       D03         Vcc       U10         Vss       K17
INV       R07         Vcc       D17         Vcc       U11         Vss       K18
KB0       R11         Vcc       E03         Vcc       U12         Vss       L03
KB1       S10         Vcc       E19         Vcc       U13         Vss       L17
KEN#      U02         Vcc       F02         Vcc       U14         Vss       M02
LEN       T02         Vcc       F18         Vcc       U15         Vss       M03
LOCK#     S03         Vcc       G01         VccCLK    P01         Vss       M17
M/IO#     S04         Vcc       G02         Vss       B05         Vss       M18
NA#       U03         Vcc       G18         Vss       B08         Vss       N03
NENE#     S11         Vcc       H01         Vss       B10         Vss       N17
PCD       R06         Vcc       H19         Vss       B12         Vss       P03
PCHK#     H16         Vcc       J01         Vss       B15         Vss       P17
PCYC      T04         Vcc       J02         Vss       C06         Vss       Q02
PEN#      R08         Vcc       J18         Vss       C07         Vss       Q18
PWT       T03         Vcc       J19         Vss       C08         Vss       R02
RESET     S02         Vcc       K01         Vss       C09         Vss       T05
SPARE     L04         Vcc       K19         Vss       C10         Vss       T06
EWBE#     D08         Vcc       L01         Vss       C11         Vss       T07
BYPASS#   A06         Vcc       L02         Vss       C12         Vss       T08
TCK       Q01         Vcc       L18         Vss       C13         Vss       T09
TDI       S14         Vcc       L19         Vss       C14         Vss       T10
TDO       R10         Vcc       M01         Vss       D02         Vss       T11
TMS       R13         Vcc       N01         Vss       E02         Vss       T12
TRST#     S13         Vcc       N02         Vss       E18         Vss       T13
Vcc       A08         Vcc       N18         Vss       F03         Vss       T14
Vcc       A09         Vcc       P02         Vss       F17         Vss       T15
Vcc       A10         Vcc       P18         Vss       G03         W/R#      T01
Vcc       A11         Vcc       Q03         Vss       G17         WB/WT#    U04
Vcc       A12         Vcc       Q17         Vss       H02
Vcc       A14         Vcc       R03         Vss       H03
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Table 7.3. Ceramic PGA Package Dimension Symbols 


Letter or
Symbol     Description of Dimensions

A          Distance from seating plane to highest point of body
A1         Distance between seating plane and base plane (lid)
A2         Distance from base plane to highest point of body
A3         Distance from seating plane to bottom of body
B          Diameter of terminal lead pin
D          Largest overall package dimension of length
D1         A body length dimension, outer lead center to outer lead center
e1         Linear spacing between true lead position centerlines
L          Distance from seating plane to end of lead
S1         Other body dimension, outer lead center to edge of body


NOTES:

1. Controlling dimension: millimeter.

2. Dimension "e1" ("e") is noncumulative.

3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.

4. Dimensions "B", "B1", and "C" are nominal.

5. Details of Pin 1 identifier are optional.
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Family: Ceramic Pin Grid Array Package 

Symbol   Millimeters                     Inches
         Min      Max      Notes         Min      Max      Notes

A        3.56     4.57                   .140     .180
A1       0.64     1.14                            .045     Solid Lid
A2       2.79                            .110     .140     Solid Lid
A3                1.40                            .055
B        0.43     0.51                   .017     .020
D        49.28    49.96                           1.967
D1       45.59
e1       2.29     2.79                   .090     .110
L        2.54     3.30                   .100     .130
N        240      280                    240      280
S1       1.52                            .060     .100
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Figure 7.3. 262-Lead Ceramic PGA Package Dimensions 
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8.0 PACKAGE THERMAL 
SPECIFICATIONS 

For this section, let: 

P   = maximum power consumption
Tc  = case temperature
Ta  = ambient air temperature
θCA = thermal resistance from case to ambient air
θJC = thermal resistance from junction to case
θJA = thermal resistance from junction to ambient air

The i860 XP microprocessor is specified for opera- 
tion when Tc is within the range of 0°C-85°C. Tc may 
be measured in any environment to determine 
whether the i860 XP microprocessor is within speci- 
fied operating range. The case temperature should 
be measured at the center of the top surface oppo- 
site the pins. 

Ta can be calculated from θCA with the following
equation:

Ta = Tc - P * θCA


Typical values for θCA at various airflows and for θJC
are given in Table 8.1 for the 1.95 sq. in., 262 pin,
ceramic PGA. θJC is shown so that θJA can be
calculated by:

θJA = θJC + θCA

Note that θJC with a heat sink differs from θJC without
a heat sink because case temperature is measured
differently. Case temperature for θJC with heat sink is
measured at the center of the heat fin base. Case
temperature for θJC without heat sink is measured at
the center of the package top surface.

Table 8.2 shows the maximum Ta allowable (without 
exceeding Tc) at various airflows. 

Note that Ta is greatly improved by attaching "fins"
or a "heat sink" to the package. P (the maximum
power consumption) is calculated by using the maximum
Icc at 5V as tabulated in the D.C. Characteristics
of section 9.

Figure 8.1 gives typical Icc derating with case
temperature. For more information on heat sinks,
measurement techniques, or package characteristics,
refer to Intel Packaging Handbook, order number
240800.
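
The relationships in this section can be checked numerically. The following sketch assumes P is the maximum supply current multiplied by a 5V supply, as stated above, and uses the 85°C upper case-temperature limit; the function and constant names are illustrative only.

```python
T_CASE_MAX_C = 85.0  # upper end of the specified case-temperature range
VCC_V = 5.0          # supply voltage used for the power calculation


def max_power_w(icc_max_a, vcc_v=VCC_V):
    """P = maximum Icc times Vcc."""
    return icc_max_a * vcc_v


def max_ambient_c(power_w, theta_ca, t_case_max_c=T_CASE_MAX_C):
    """Ta = Tc - P * thetaCA, evaluated at the maximum case temperature."""
    return t_case_max_c - power_w * theta_ca


def theta_ja(theta_jc, theta_ca):
    """thetaJA = thetaJC + thetaCA."""
    return theta_jc + theta_ca


# 50 MHz part: maximum Icc of 1.2 A at 5 V gives P = 6 W.
p_50mhz = max_power_w(1.2)
# Still air (0 ft/min): thetaCA = 10.1 with heat sink, 13.5 without.
ta_with_heat_sink = max_ambient_c(p_50mhz, 10.1)
ta_without_heat_sink = max_ambient_c(p_50mhz, 13.5)
```

With these inputs the results are about 24.4°C and 4.0°C, which agree with the still-air entries of 24 and 4 listed for 50 MHz in Table 8.2.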


Table 8.1. Thermal Resistance — In °C/Watt 




                              θCA as a Function of Airflow, ft/min (m/sec)

                     θJC      0        200      400      600      800      1000
                              (0)      (1.01)   (2.03)   (3.04)   (4.06)   (5.07)

With Heat Sink*      1.6      10.1     6.3      4.3      3.2      2.5      2.2
Without Heat Sink    1.0      13.5     11.0     8.0      6.5      5.5      5.0


NOTE: 

* Nine-fin, unidirectional heat sink (fin dimensions: 0.250" height, 0.040" fin width, 0.100" center-to-center spacing, 1.730" 
length) 


Table 8.2. Maximum Ta at Various Airflows — In °C 




                                 Airflow, ft/min (m/sec)

                        fCLK     0      200      400      600      800      1000
                        (MHz)    (0)    (1.01)   (2.03)   (3.04)   (4.06)   (5.07)

Ta with Heat Sink*      50       24     47       59       66       70       72
Ta without Heat Sink    50       4      19       37       46       52       55
Ta with Heat Sink*      40       34.5   53.5     63.5     69       72.5     74
Ta without Heat Sink    40       17.5   30       45       52.5     57.5     60


NOTE: 

* Nine-fin, unidirectional heat sink (fin dimensions: 0.250" height, 0.040" fin width, 0.100" center-to-center spacing, 1.730" 
length) 
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9.0 ELECTRICAL DATA 

All input and output timings are specified relative to
the 1.5V level of the rising edge of CLK and refer to
the point that the signal reaches 1.5V.
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9.1 Absolute Maximum Ratings 


Case Temperature Tc under Bias ............... 0°C to 85°C

Storage Temperature .......................... -65°C to +150°C

Voltage on Any Pin with
Respect to Ground ............................ -0.5 to Vcc + 0.5V


NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 

* WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings” may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and ex- 
tended exposure beyond the “Operating Conditions” 
may affect device reliability. 


9.2 D.C. Characteristics 


Table 9.1. D.C. Characteristics
Operating Conditions: Vcc = 5V ±5%; Tc = 0°C to 85°C

Symbol  Parameter                          Min     Max         Units   Notes

VIL     Input LOW voltage (TTL)            -0.3    +0.8        V
VIH     Input HIGH voltage (TTL)           2.0     Vcc + 0.3   V
VIHC    CLK input HIGH (TTL)               2.5     Vcc + 0.3   V
VOL     Output LOW voltage (TTL)                   0.45        V       1
VOH     Output HIGH voltage (TTL)          2.4                 V       2
ICC     Power supply current (@ 50 MHz)            1.2         Amp     3
ICC     Power supply current (@ 40 MHz)            1.0         Amp     3
ILI     Input leakage current                      ±15         µA      4
IUP     Input leakage current (pull-up)            -400        µA      5
ILO     Output leakage current                     ±15         µA      6
CIN     Input capacitance                          11.5        pF      7
CO      I/O or output capacitance                  14          pF      7


NOTES:

1. This parameter is measured with current load of 5 mA.

2. This parameter is measured with current load of 1 mA. Typical value is Vcc - 0.45V.

3. Measured at 50 MHz and Vcc = 5V.

4. This parameter is for inputs without pull-ups. Vcc is on, and 0V ≤ VIN ≤ Vcc.

5. This parameter is for inputs with pull-ups and VIL = 0.45V. Note that if the pull-ups are put in high-impedance state via the
DCTL boundary scan cell that also tri-states the data outputs, then the leakage is ±15 µA.

6. 0.45V ≤ VIN ≤ Vcc - 0.45V.

7. These parameters are not tested; they are guaranteed by design characterization.
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9.3 A.C. Characteristics 

Table 9.2. A.C. Characteristics 

CL = 0 pF Unless Otherwise Specified; Vcc = 5V ±5%; Tc = 0°C to 85°C

Symbol  Parameter                                 Fig   40 MHz          50 MHz          Notes
                                                        Min     Max     Min     Max
                                                        (ns)    (ns)    (ns)    (ns)

tc      CLK Period                                9.1   25      40      20      40
ttc     TCK Period                                9.2   40      1000    40      1000
        CLK Stability                             9.1           0.1%            0.1%
tch     CLK High Time                             9.1   7               7
tcl     CLK Low Time                              9.1   7               7
tr      CLK Rise Time                             9.1           3               3       h
tf      CLK Fall Time                             9.1           3               3       h
ts      TCK to CLK Skew                           9.3           ±1              ±1      i
ttch    TCK High Time                             9.2   10              10
ttcl    TCK Low Time                              9.2   10              10
ttcr    TCK Rise Time                             9.2           4               4
ttcf    TCK Fall Time                             9.2           4               4
tsu.1   RESET, HOLD, BERR, FLINE#,                9.1   8               7
        PEN#, INT/CS8 Setup Time
tsu.2   BOFF#, AHOLD, KEN#, NA#,                  9.1   8               7
        INV, WB/WT# Setup Time
tsu.3   EADS# Setup Time                          9.1   9               8
tsu.4   EWBE# Setup Time                          9.1   8.5             7.5
tsu.5   BRDY# Setup Time                          9.1   8.5             7.5
tsu.6   D63-D0, DP7-DP0 Setup Time                9.1   8.5             7.5
tsu.7   D63-D0, DP7-DP0 Setup Time                9.1   5.5             4.5
        (Late Backoff Mode)
tsu.8   A31-A5 Setup Time                         9.1   11              10
ttsu    TDI, TMS, TRST# Setup Time                9.2   8               8
tth     TDI, TMS, TRST# Hold Time                 9.2   2               1               b
th.1    Hold Time, All Inputs                     9.1   2               1               c
        except D63-D0, DP7-DP0
th.2    D63-D0, DP7-DP0 Hold Time                 9.1   3               2               c
        (Normal and Late Back-Off Mode)
ttco    TDO Valid Delay and All Outputs           9.2   1.5     17.5    1.5     16.5    a, f
        Valid Delay in EXTEST Mode
tco.1   A31-A22 Valid Delay                       9.1   1.5     12      1.5     11      a
tco.2a  A21-A3 Valid Delay                        9.1   1.5     11.5    1.5     10.5    a, g
        (High Current Mode)
tco.2b  A21-A3 Valid Delay                        9.1   1.5     12      1.5     11      a
        (Normal Current Mode)



Table 9.2. A.C. Characteristics (Continued) 

CL = 0 pF Unless Otherwise Specified; Vcc = 5V ±5%; Tc = 0°C to 85°C

Symbol  Parameter                                 Fig   40 MHz          50 MHz          Notes
                                                        Min     Max     Min     Max
                                                        (ns)    (ns)    (ns)    (ns)

tco.3   D63-D0, DP7-DP0 Valid Delay               9.1   2.5     14      2.5     13      a, d
tco.4   BREQ, HLDA, PCHK#,                        9.1   1.5     13      1.5     12      a
        NENE#, KB0, KB1 Valid Delay
tco.5a  ADS# Valid Delay                          9.1   1.5     10      1.5     9       a, g
        (High Current Mode)
tco.5b  ADS# Valid Delay                          9.1   1.5     11      1.5     10      a
        (Normal Current Mode)
tco.6a  W/R# Valid Delay                          9.1   1.5     11      1.5     10      a, g
        (High Current Mode)
tco.6b  W/R# Valid Delay                          9.1   1.5     12      1.5     11      a
        (Normal Current Mode)
tco.7a  HITM# Valid Delay                         9.1   1.5     12      1.5     11      a, g
        (High Current Mode)
tco.7b  HITM# Valid Delay                         9.1   1.5     13      1.5     12      a
        (Normal Current Mode)
tco.8   PWT, PCD, HIT#, CTYP, D/C#, M/IO#,        9.1   1.5     12      1.5     11      a
        PCYC, LOCK#, CACHE#, LEN Valid Delay
tco.9a  BE0#-BE7# Valid Delay                     9.1   1.5     12      1.5     11      a, g
        (High Current Mode)
tco.9b  BE0#-BE7# Valid Delay                     9.1   1.5     13      1.5     12      a
        (Normal Current Mode)
tz.1    Float Time All Outputs                    9.1   2       19      2       18      e
        except D63-D0, DP7-DP0
tz.2    Float Time D63-D0, DP7-DP0                9.1   3       19      3       18      e
tzt     Float Time during Boundary Scan EXTEST    9.1           20              20      f


NOTES:

a. Minimum and maximum delays are for 0 pF load.

b. These hold times are referenced to the falling edge of TCK.

c. These hold times are referenced to the rising edge of CLK.

d. Output delay for D63-D0, DP7-DP0 is from the CLK after ADS# activation.

e. Float time = delay until maximum output current is less than ±ILO. Float time is not tested.

f. Delay from falling edge of TCK.

g. These pins can be configured as normal or high-current buffers. When they are configured as high-current buffers for
interface with cache memory or other large loads, use the derating curves in Figure 9.3. Otherwise, all normal buffers use
the derating curves in Figure 9.4.

h. tr and tf should be measured between 0.8V and 2.5V.

i. Assumes TCK and CLK both at 25 MHz.
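
Valid-delay and setup figures such as those in Table 9.2 are typically combined into a per-cycle timing budget. The sketch below is a generic illustration, not a procedure from this data sheet: the function name is hypothetical, and the receiver setup time and flight time in the example are made-up numbers for an imaginary external device. Because the tabulated delays are for a 0 pF load, a real budget would also add the load derating from the referenced curves.

```python
def setup_slack_ns(clk_period_ns, valid_delay_max_ns, setup_ns,
                   flight_ns=0.0, skew_ns=0.0):
    """Time left in one CLK period after output delay, flight time,
    receiver setup time, and clock skew have been subtracted."""
    return clk_period_ns - valid_delay_max_ns - flight_ns - setup_ns - skew_ns


# 50 MHz: CLK period = 20 ns; ADS# valid delay (normal current mode) = 10 ns max.
# The 5 ns receiver setup and 1.5 ns flight time are hypothetical values.
slack = setup_slack_ns(20.0, 10.0, setup_ns=5.0, flight_ns=1.5)
```

A negative result would indicate that the path does not meet setup timing in a single clock under the assumed conditions.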
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NOTES: 

Graphs are not linear outside the Cl range shown. 
*Typical part under worst-case conditions. 


Figure 9.5a. Typical Slew Time vs Load Capacitance under Worst-Case Conditions (Rising Voltage) 



NOTES: 

Graphs are not linear outside the Cl range shown. 
*Typical part under worst-case conditions.


Figure 9.5b. Typical Slew Time vs Load Capacitance under Worst-Case Conditions (Falling Voltage) 
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9.4 Component Buffer Model 

9.4.1 FIRST ORDER ELECTRICAL BUFFER 
MODEL 

The first order electrical buffer model provides an
accurate and simple representation of the buffers
used in the inputs and outputs of the CHMOS i860
XP CPU. The model output consists of four components,
as shown in Figure 9.7a:

1. Linear voltage waveform (dV/dt)

2. Intrinsic buffer delay due to CL (t0)

3. Buffer output impedance (Ro)

4. Buffer output capacitance (Co)

A fitting algorithm has been used to arrive at values
for dV/dt, t0, Co, and Ro such that Ro matches the
actual buffer impedance and Co, the intrinsic buffer
output capacitance whether the output is on or off,
remains constant across the operating range while
minimizing the difference between the full buffer
circuit and its simplified electrical model for a set of
different loads (lumped capacitance, and short and
long transmission lines). dV/dt is the slope of the
voltage ramp, while t0 is the intrinsic buffer delay
associated with a given CL. t0 accounts for the
intrinsic delay by offsetting the excitation of the
model by the amount of the delay.

NOTE: 

t0 is zero for CL = 0 and when the load is represented
by a transmission line.
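
To make the four-component output model concrete, the sketch below simulates it for a lumped capacitive load. It is a simplified illustration under stated assumptions: a delayed linear ramp source drives Ro, and Co is taken to sit in parallel with the load capacitance at the output node; package parasitics are ignored; a dV/dt entry such as 5.5/1.2 is read as a 5.5V swing in 1.2 ns; and the function names and forward-Euler integration are choices made for this example.

```python
def buffer_waveform(v_swing, ramp_ns, t0_ns, r_o_ohm, c_o_pf, c_load_pf,
                    t_end_ns=20.0, dt_ns=0.001):
    """Return (time_ns, volts) samples of the output node for a rising edge.

    The source ramps linearly from 0 to v_swing in ramp_ns, starting at
    t0_ns, and charges (c_o_pf + c_load_pf) through r_o_ohm.
    """
    # ohm * pF = 1e-12 s = 1e-3 ns
    tau_ns = r_o_ohm * (c_o_pf + c_load_pf) * 1e-3
    samples = []
    v_out = 0.0
    steps = int(round(t_end_ns / dt_ns))
    for n in range(steps + 1):
        t = n * dt_ns
        frac = min(max((t - t0_ns) / ramp_ns, 0.0), 1.0)
        v_src = v_swing * frac
        samples.append((t, v_out))
        # Forward-Euler step of dV/dt = (Vsrc - Vout) / tau
        v_out += (v_src - v_out) * dt_ns / tau_ns
    return samples


def crossing_time(samples, level):
    """First sample time at which the output reaches `level`, or None."""
    for t, v in samples:
        if v >= level:
            return t
    return None


# Small-buffer values read from the first row of Table 9.3
# (Vcc = 5.5V, Tj = -10 C): Ro = 28.0 ohms, Co = 4.3 pF, dV/dt = 5.5/1.2.
unloaded = buffer_waveform(5.5, 1.2, 0.0, 28.0, 4.3, 0.0)
loaded = buffer_waveform(5.5, 1.2, 0.3, 28.0, 4.3, 50.0)  # t0 = 0.3 ns at 50 pF
```

Comparing `crossing_time(unloaded, 1.5)` with `crossing_time(loaded, 1.5)` shows how the intrinsic delay t0 and the larger RC time constant together push out the 1.5V crossing as the load grows.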



The input model consists of one component, buffer
capacitance (CIN), as shown in Figure 9.7b.



Figure 9.7b. Input Model 


9.4.2 FIRST ORDER ELECTRICAL MODEL 
PARAMETER VALUES 


The parameters that make up the first order electrical
model vary with the buffer design. In addition,
these parameters also vary with the operating condition
(i.e., temperature and Vcc) of the buffer process.
The typical process corner is being modeled.
Two sizes of buffer are used on these components,
labelled here as small and large. The parameter values
found in Tables 9.3 and 9.4 list dV/dt, t0, Ro, and
Co. These parameters are provided for both low-to-high
and high-to-low transitions at the typical process
corner for three operating conditions (Vcc =
5.5V and Tj = -10°C, Vcc = 5.0V and Tj = 80°C,
and Vcc = 4.5V and Tj = 125°C).



9.4.3 PACKAGE PARAMETERS 

In addition to the buffer characteristics, package
characteristics are also included to complete the
model. Package inductance, capacitance and resistance
values vary with design geometry and material
properties of the package. Figure 9.8 shows a model
of the package including these parameters and
should be placed between the first order electrical
buffer model as shown in Figure 9.9 and the board
interconnects. Notice the package model only includes
the package inductance (Lp) and capacitance (Cp).
This is sufficient since the package resistance is so
small it is negligible.

Table 9.5 lists the buffer model parameters for each
pin of the i860 XP microprocessor. The table gives
the package model parameters for each pin, followed
by the input capacitance (input and I/O pins)
and/or output buffer size (outputs and I/O). In those
cases where the buffer used by a pin is an option
selected at reset by the PEN# input, the output buffer
column lists the sizes available. Large buffers
correspond to high-current mode, while small buffers
correspond to normal current mode.



9.4.4 BOARD INTERCONNECTS 

Figure 9.8. Package Model

The board interconnect can be considered as a
lumped parameter (capacitive load) or as a transmission
line. As a rule of thumb, an unterminated board
interconnect may be considered as a capacitive load
if the round trip time (time for signal to travel from
one end of the interconnect to the other and back) is
small compared to the transition time of the signal.

At frequencies of 50 MHz and above most interconnects
behave as transmission lines (Figure 9.10).
For accurate results at high frequencies, these
transmission line effects must be taken into account
and modeled.
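
The rule of thumb can be turned into a quick screening check. In the sketch below, the propagation delay of 0.18 ns per inch is an assumed, typical figure for a board trace and the threshold fraction of 0.5 is an arbitrary illustrative choice for "small compared to"; neither value comes from this data sheet, and the function names are hypothetical.

```python
def round_trip_ns(length_in, delay_ns_per_in=0.18):
    """Round trip time of a trace: there and back at the assumed delay."""
    return 2.0 * length_in * delay_ns_per_in


def is_lumped_load(length_in, transition_ns, max_fraction=0.5,
                   delay_ns_per_in=0.18):
    """True if the round trip time is small compared to the transition time,
    so the trace may be treated as a capacitive load."""
    return round_trip_ns(length_in, delay_ns_per_in) <= max_fraction * transition_ns
```

For example, under these assumptions a 1 inch trace driven with a 3 ns edge passes the check, while a 10 inch trace driven with a 1.2 ns edge does not and would need a transmission line model.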


Figure 9.9a. Output Buffer and Package Model

Figure 9.9b. Input Buffer and Package Model

Figure 9.10. Transmission Line Model

Table 9.3. Small Output Buffer First Order Electrical Model Parameter Values

Transition   Vcc  Tj    Ro      Co    dV/dT    t0 (ns) at various CL (pF)
                  (C)   (ohms)  (pF)           0    5    25   50   100  150

Low-to-High  5.5  -10   28.0    4.3   5.5/1.2  0    0.0  0.1  0.3  0.7  1.1
Low-to-High  5.5  80    36.4    4.3   5.5/1.4  0    0.0  0.1  0.8  0.8  1.2
Low-to-High  5.5  125   40.4    4.3   5.5/1.5  0    0.0  0.1  0.4  0.8  1.2
Low-to-High                           5.0/1.2
Low-to-High

[Remaining rows of Table 9.3 are illegible.]


Table 9.4. Large Output Buffer First Order Electrical Model Parameter Values 


Transition   Vcc  Tj    Ro      Co    dV/dT    t0 (ns) at various CL (pF)
                  (C)   (ohms)  (pF)           0    5    25   50   100  150  200  250  300

Low-to-High  5.5  -10   12.1    4.3   5.5/0.7  0    0.0  0.1  0.3  0.6  0.8  1.0  1.3  1.5
Low-to-High  5.5  80    15.5    4.3   5.5/0.9  0    0.0  0.2  0.3  0.6  0.9  1.1  1.4  1.7
Low-to-High  5.5  125   17.2    4.3   5.5/1.1  0    0.0  0.2  0.4  0.7  1.0  1.2  1.4  1.7
Low-to-High  5.0  -10   13.0    4.3   5.0/0.9  0    0.0  0.1  0.3  0.6  0.9  1.1  1.4  1.7
Low-to-High  5.0  80    16.7    4.3   5.0/1.0  0    0.0  0.2  0.4  0.8  1.1  1.4  1.7  2.0
Low-to-High  5.0  125   18.5    4.3   5.0/1.2  0    0.0  0.2  0.4  0.8  1.1  1.4  1.7  2.0
Low-to-High  4.5  -10   14.1    4.3   4.5/0.9  0    0.0  0.2  0.4  0.7  1.1  1.4  1.7  2.0
Low-to-High  4.5  80    18.0    4.3   4.5/1.2  0    0.0  0.2  0.4  0.9  1.2  1.5  1.9  2.2
Low-to-High  4.5  125   19.9    4.3   4.5/1.3  0
High-to-Low  5.5  -10   10.6    4.3   5.5/0.7  0    0.0  0.3  0.6  0.9  1.2  1.5  1.8  2.0
High-to-Low  5.5  80    13.9    4.3   5.5/1.0  0    0.0  0.4                 1.9  2.2  2.5
High-to-Low  5.5  125   15.8    4.3   5.5/1.1  0              0.8  1.3       2.0  2.4  2.8
High-to-Low  5.0  -10   11.0    4.3   5.0/0.8  0                             1.6  1.9  2.1
High-to-Low  5.0  80    14.5    4.3   5.0/1.0  0                                  2.3  2.6
High-to-Low  5.0  125   16.5    4.3   5.0/1.2  0                        1.7  2.1  2.5  2.8
High-to-Low  4.5  -10   11.3    4.3   4.5/0.9  0                             1.7  2.0  2.4
High-to-Low  4.5  80    15.2    4.3   4.5/1.2  0    0.0  0.4  0.8  1.3  1.6  2.0  2.3  2.7
High-to-Low  4.5  125   17.4    4.3   4.5/1.3  0    0.0  0.4  0.8  1.3  1.7  2.1  2.5  2.8
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Table 9.5 Buffer Models 


Pin Name  Location  Cp (pF)  Lp (nH)  Input     Output Buffer
                    Typical  Typical  Buffer    Size
                                      CIN (pF)  (Large or Small)
                                      Typical
A3        S01       7.6      13.8     6.7       L/S
A4        R01       6.2      14.5     6.7       L/S
A5        M05       6.5      7.8      6.7       L/S
A6        L05       5.3      8.0      6.7       L/S
A7        F01       7.7      16.2     6.7       L/S
A8        K05       5.1      7.7      6.7       L/S
A9        E01       8.0      16.4     6.7       L/S
A10       K04       5.1      8.8      6.7       L/S
A11       D01       8.3      16.8     6.7       L/S
A12       J04       5.2      9.0      6.7       L/S
A13       C01       8.7      17.2     6.7       L/S
A14       J05       5.2      7.8      6.7       L/S
A15       B01       9.0      17.8     6.7       L/S
A16       H04       5.2      9.0      6.7       L/S
A17       A01       9.4      18.2     6.7       L/S
A18       C03       7.8      14.5     6.7       L/S
A19       C02       9.0      15.3     6.7       L/S
A20       H05       7.5      7.7      6.7       L/S
A21       B02       8.5      15.7     6.7       L/S
A22       G04       7.5      9.1      4.4       S
A23       A02       8.1      15.7     4.4       S
A24       B03       7.0      14.5     4.4       S
A25       A03       7.7      14.6     4.4       S
A26       G05       6.7      7.9      4.4       S
A27       E04       ?        9.6      4.4       S
A28       F04       6.5      9.2      4.4       S
A29       D04       7.4      10.0     4.4       S
A30       F05       5.9      8.2      4.4       S
A31       C04       6.6      10.4     4.4       S
ADS#      N04       6.2      9.1                L/S
AHOLD     Q05       6.0      8.8      2.0
BE0#      E05       5.7      8.8                L/S
BE1#      D05       6.7      ?                  L/S
BE2#      D06       5.7      9.0                L/S
BE3#      B04       6.5      11.2               L/S
BE4#      C05       5.9      10.6               L/S
BE5#      A04       6.5      12.0               L/S
BE6#      D07       4.9      8.6                L/S
BE7#      A05       6.1      11.6               L/S
BERR      S07       6.8      8.7      2.0
BOFF#     R04       6.3      10.4     2.0
RSRVD     P04       ?        9.4      2.0
BRDY#     U01       8.0      14.7     2.0
BREQ      R09       4.4      7.6                ?
BYPASS#   A06       Strapping Option
CACHE#    Q04       6.6      9.8                S
CLK       M04       6.2      8.9      2.0
CTYP      P05       6.6      8.6                S
D0        A07       6.6      10.6     4.4       S
D1        D09       7.6      7.6      4.4       S
D2        A13       7.4      16.0     4.4       S
D3        A16       7.7      17.7     4.4       S
D4        A17       9.2      17.9     4.4       S
D5        D10       7.6      7.6      4.4       S
D6        A18       9.4      18.3     4.4       S
D7        C17       8.6      16.9     4.4       S
D8        C16       8.6      14.6     4.4       S
D9        B16       ?        14.7     4.4       S
D10       D11       8.3      7.6      4.4       S
D11       B17       8.9      14.7     4.4       S
D12       C15       8.1      7.8      4.4       S
D13       B18       8.6      16.4     4.4       S
D14       D12       7.2      7.8      4.4       S
D15       B19       8.2      16.6     4.4       S
D16       C18       7.9      10.7     4.4       S
D17       D14       6.7      9.2      4.4       S
D18       C19       7.6      14.2     4.4       S
D19       D15       6.4      10.0     4.4       S
D20       D16       7.4      10.7     4.4       S
D21       E15       5.6      8.8      4.4       S
D22       D19       6.7      12.7     4.4       S
D23       E16       5.5      9.7      4.4       S
D24       F15       5.3      8.3      4.4       S
D25       E17       6.6      9.9      4.4       S
D26       F16       5.3      9.7      4.4       S
D27       F19       6.2      11.7     4.4       S
D28       G15       5.1      7.9      4.4       S
D29       G19       6.2      11.8     4.4       S
D30       G16       5.1      8.9      4.4       S
D31       M19       8.6      16.2     4.4       S
D32       H15       5.2      7.7      4.4       S
D33       R18       11.0     ?        4.4       S
D34       P19       8.0      18.4     4.4       S
D35       R19       9.1      18.8     4.4       S
D36       N19       8.1      16.9     4.4       S
D37       S19       9.2      20.7     4.4       S
D38       J16       8.4      8.9      4.4       S
D39       T19       10.5     19.6     4.4       S
D40       U19       10.8     19.1     4.4       S
D41       L16       8.3      10.9     4.4       S
D42       T18       10.5     17.8     4.4       S
D43       K16       8.4      8.8      4.4       S
D44       U18       10.1     17.7     4.4       S
D45       K15       9.3      7.5      4.4       S
D46       S17       9.5      14.5     4.4       S
D47       M16       8.0      9.8      4.4       S
D48       L15       8.0      7.7      4.4       S
D49       T17       8.7      14.6     4.4       S
D50       N16       7.8      9.9      4.4       S
D51       U17       8.6      15.2     4.4       S
D52       S18       7.6      14.3     4.4       S
D53       M15       7.7      7.1      4.4       S
D54       Q16       7.0      11.1     4.4       S
D55       U16       8.0      14.3     4.4       S
D56       T16       7.8      12.8     4.4       S
D57       R16       6.5      11.8     4.4       S
D58       S16       7.5      11.3     4.4       S
D59       P15       6.2      8.7      4.4       S
D60       R15       7.1      9.6      4.4       S
D61       Q15       5.9      9.3      4.4       S
D62       S15       6.9      10.7     4.4       S
D63       R14       5.6      9.7      4.4       S
D/C#      R05       5.8      9.7                S
DP0       A15       7.7      18.3     4.4       S
DP1       A19       9.7      18.9     4.4       S
DP2       D13       7.1      8.5      4.4       S
DP3       D18       6.7      11.3     4.4       S
DP4       Q19       10.4     19.0     4.4       S
DP5       J15       9.9      7.7      4.4       S
DP6       P16       9.3      10.7     4.4       S
DP7       N15       6.8      8.9      4.4       S
EADS#     S05       5.5      10.5     2.0
EWBE#     D08       7.5      7.6      2.0
FLINE#    S08       5.4      8.1      2.0
HIT#      S12       5.9      11.1               S
HITM#     N05       6.2      8.2                L
HLDA      S09       5.3      7.9                S
HOLD      R12       6.1      11.1     2.0
INT/CS8   S06       5.2      10.0     2.0
INV       R07       5.3      8.2      2.0
KB0       R11       6.1      9.2                S
KB1       S10       6.4      7.9                S
KEN#      U02       7.4      13.4     2.0
LEN       T02       7.9      12.8               S
LOCK#     S03       7.7      11.2               S
M/IO#     S04       7.3      10.3               S
NA#       U03       7.1      13.0     2.0
NENE#     S11       6.3      9.6                S
PCD       R06       5.6      8.9                S
PCHK#     H16       5.1      8.8                S
PCYC      T04       7.2      11.4               S
PEN#      R08       4.8      7.8      2.0
PWT       T03       7.4      12.1               S
RESET     S02       7.9      12.5     2.0
SPARE     L04       NC
TCK       Q01       5.8      14.1     2.0
TDI       S14       6.5      9.8      2.0
TDO       R10       6.3      7.6                S
TMS       R13       5.6      9.6      2.0
TRST#     S13       6.3      9.6      2.0
W/R#      T01       7.8      14.3               L/S
WB/WT#    U04       6.7      12.3     2.0

10.0 INSTRUCTION SET

Key to abbreviations:

For register operands, the abbreviations that describe the operands are
composed of two parts. The first part describes the type of register:

c         One of the control registers fir, psr, epsr, dirbase, db, fsr,
          bear, ccr, p0, p1, p2, or p3

f         One of the floating-point registers: f0 through f31

i         One of the integer registers: r0 through r31

The second part identifies the field of the machine instruction into which
the operand is to be placed:

src1      The first of the two source-register designators, which may be
          either a register or a 16-bit immediate constant or address
          offset. The immediate value is zero-extended for logical
          operations and is sign-extended for add and subtract operations
          (including addu and subu) and for all addressing calculations.

src1ni    Same as src1 except that no immediate constant or address offset
          value is permitted.

src1s     Same as src1 except that the immediate constant is a 5-bit value
          that is zero-extended to 32 bits.

src2      The second of the two source-register designators.

dest      The destination register designator.

Thus, the operand specifier isrc2, for example, means that an integer
register is used and that the encoding of that register must be placed in
the src2 field of the machine instruction.

Other (nonregister) operands are specified by a one-part abbreviation that
represents both the type of operand required and the instruction field into
which the value of the operand is placed:

#const    A 16-bit immediate constant or address offset that the i860 XP
          microprocessor sign-extends to 32 bits when computing the
          effective address.

lbroff    A signed, 26-bit, immediate, relative branch offset.

sbroff    A signed, 16-bit, immediate, relative branch offset.

brx       A function that computes the target address by shifting the
          offset (either lbroff or sbroff) left by two bits, sign-extending
          it to 32 bits, and adding the result to the current instruction
          pointer plus four. The resulting target address may lie anywhere
          within the address space.
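
The brx computation described above can be sketched in Python as follows. This is only an illustrative model: the names sign_extend and brx, the bits parameter, and the use of a plain integer for the instruction pointer are assumptions made for this example and are not part of the i860 XP definition.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def brx(offset, bits, ip):
    """Branch target for a 26-bit lbroff (bits=26) or a 16-bit sbroff (bits=16).

    `ip` is the address of the branch instruction (hypothetical parameter).
    """
    displacement = sign_extend(offset, bits) << 2   # offset counts 4-byte words
    return (ip + 4 + displacement) & MASK32         # wrap to a 32-bit address
```

With this model, an offset field of all ones (that is, -1) yields the address of the branch itself, because the -4 byte displacement cancels the "plus four".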


Table 10.1. Precision Specification

Suffix    Source Precision    Result Precision
.ss       single              single
.sd       single              double
.dd       double              double
.ds       double              single

Unless otherwise specified, floating-point operations accept single- or
double-precision source operands and produce a result of equal or greater
precision. Both input operands must have the same precision. The source and
result precision are specified by a two-letter suffix to the mnemonic.

Other abbreviations include:

.p                      Precision specification .ss, .sd, or .dd (.ds not
                        permitted). Refer to Table 10.1.

.r                      Precision specification .ss, .sd, .ds, or .dd.
                        Refer to Table 10.1.

.v                      .sd or .dd. Refer to Table 10.1.

.w                      .ss or .dd. Refer to Table 10.1.

.x                      .b (8 bits), .s (16 bits), or .l (32 bits)

.y                      .l (32 bits), .d (64 bits), or .q (128 bits)

mem.x(address)          The memory location indicated by address with a
                        size of x.

port.x(address)         The I/O port indicated by address with a size of x.

int_vector.x(address)   The interrupt vector with a size of x returned from
                        I/O port address.

PM                      The pixel mask, which is considered as an array of
                        eight bits PM(7)..PM(0), where PM(0) is the
                        least-significant bit.

10.1 Instruction Definitions in Alphabetical Order

adds isrc1, isrc2, idest    Add Signed
    idest ← isrc1 + isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 + isrc1 < 0 (signed)
    CC clear if isrc2 + isrc1 ≥ 0 (signed)

addu isrc1, isrc2, idest    Add Unsigned
    idest ← isrc1 + isrc2
    OF ← bit 31 carry
    CC ← bit 31 carry
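
The flag rules for adds and addu can be made concrete with a small Python model. This is a sketch under stated assumptions: the function names and the (idest, OF, CC) return tuple are invented for illustration, and CC for adds is taken from the exact (untruncated) signed sum, which is one reading of the definition above.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Reinterpret a 32-bit pattern as a signed two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def adds(isrc1, isrc2):
    """Return (idest, OF, CC) for a signed add."""
    a, b = isrc1 & MASK32, isrc2 & MASK32
    idest = (a + b) & MASK32
    carry31 = (a + b) >> 32                                   # carry out of bit 31
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31     # carry out of bit 30
    of = carry31 != carry30
    cc = to_signed(a) + to_signed(b) < 0                      # assumption: exact sum
    return idest, of, cc

def addu(isrc1, isrc2):
    """Return (idest, OF, CC) for an unsigned add; both flags copy the carry."""
    total = (isrc1 & MASK32) + (isrc2 & MASK32)
    carry31 = bool(total >> 32)
    return total & MASK32, carry31, carry31
```

For example, adding 1 to 0x7FFFFFFF with adds sets OF because the carry into bit 31 is not matched by a carry out of it, while the same operands under addu leave both flags clear.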


and isrc1, isrc2, idest    Logical AND
    idest ← isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andh #const, isrc2, idest    Logical AND High
    idest ← (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise

andnot isrc1, isrc2, idest    Logical AND NOT
    idest ← (not isrc1) and isrc2
    CC set if result is zero, cleared otherwise

andnoth #const, isrc2, idest    Logical AND NOT High
    idest ← (not (#const shifted left 16 bits)) and isrc2
    CC set if result is zero, cleared otherwise

bc lbroff    Branch on CC
    IF CC = 1
    THEN continue execution at brx(lbroff)
    FI

bc.t lbroff    Branch on CC, Taken
    IF CC = 1
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI

bla isrc1ni, isrc2, sbroff    Branch on LCC and Add
    LCC-temp clear if isrc2 + isrc1ni < 0 (signed)
    LCC-temp set if isrc2 + isrc1ni ≥ 0 (signed)
    isrc2 ← isrc1ni + isrc2
    Execute one more sequential instruction
    IF LCC
    THEN LCC ← LCC-temp
         continue execution at brx(sbroff)
    ELSE LCC ← LCC-temp
    FI

bnc lbroff    Branch on Not CC
    IF CC = 0
    THEN continue execution at brx(lbroff)
    FI

bnc.t lbroff    Branch on Not CC, Taken
    IF CC = 0
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI

br lbroff    Branch Direct Unconditionally
    Execute one more sequential instruction.
    Continue execution at brx(lbroff).

bri [isrc1ni]    Branch Indirect Unconditionally
    Execute one more sequential instruction
    IF any trap bit in psr is set
    THEN copy PU to U, PIM to IM in psr
         clear trap bits
         IF DS is set and DIM is reset
         THEN enter dual-instruction mode after executing one
              instruction in single-instruction mode
         ELSE IF DS is set and DIM is set
              THEN enter single-instruction mode after executing one
                   instruction in dual-instruction mode
              ELSE IF DIM is set
                   THEN enter dual-instruction mode
                        for next instruction pair
                   ELSE enter single-instruction mode
                        for next instruction pair
                   FI
              FI
         FI
    FI
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.)

bte isrc1s, isrc2, sbroff    Branch if Equal
    IF isrc1s = isrc2
    THEN continue execution at brx(sbroff)
    FI

btne isrc1s, isrc2, sbroff    Branch if Not Equal
    IF isrc1s ≠ isrc2
    THEN continue execution at brx(sbroff)
    FI

call lbroff    Subroutine Call
    r1 ← address of next sequential instruction + 4 (or + 8 in dual mode)
    Execute one more sequential instruction
    Continue execution at brx(lbroff)

calli [isrc1ni]    Indirect Subroutine Call
    r1 ← address of next sequential instruction + 4 (or + 8 in dual mode)
    Execute one more sequential instruction
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned. The
    register isrc1ni must not be r1.)

fadd.p fsrc1, fsrc2, fdest    Floating-Point Add
    fdest ← fsrc1 + fsrc2

faddp fsrc1, fsrc2, fdest    Add with Pixel Merge
    fdest ← fsrc1 + fsrc2 (using integer arithmetic; 8-byte operands and destination)
    Shift and load MERGE register from fsrc1 + fsrc2 as defined in Table 10.2

faddz fsrc1, fsrc2, fdest    Add with Z Merge
    fdest ← fsrc1 + fsrc2 (using integer arithmetic; 8-byte operands and destination)
    Shift MERGE right 16 and load fields 31..16 and 63..48 from fsrc1 + fsrc2

famov.r fsrc1, fdest    Floating-Point Adder Move
    fdest ← fsrc1

fiadd.w fsrc1, fsrc2, fdest    Long-Integer Add
    fdest ← fsrc1 + fsrc2 (2's complement integer arithmetic)

fisub.w fsrc1, fsrc2, fdest    Long-Integer Subtract
    fdest ← fsrc1 - fsrc2 (2's complement integer arithmetic)

fix.v fsrc1, fdest    Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1 rounded

Floating-Point Load
fld.y isrc1(isrc2), fdest      (Normal)
fld.y isrc1(isrc2)++, fdest    (Autoincrement)
    fdest ← mem.y (isrc1 + isrc2)
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

Cache Flush
flush #const(isrc2)      (Normal)
flush #const(isrc2)++    (Autoincrement)
    Write back (if modified) the line in data cache that has address (#const + isrc2)
    80860XR: and set tag value to (#const + isrc2).
    80860XP: and invalidate its virtual and physical tags.
    Contents of line undefined.
    IF autoincrement
    THEN isrc2 ← #const + isrc2
    FI

fmlow.dd fsrc1, fsrc2, fdest    Floating-Point Multiply Low
    fdest ← low-order 53 bits of (fsrc1 mantissa × fsrc2 mantissa)
    fdest bit 53 ← most significant bit of (fsrc1 mantissa × fsrc2 mantissa)

fmov.r fsrc1, fdest    Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    fmov.ss fsrc1, fdest = fiadd.ss fsrc1, f0, fdest
    fmov.dd fsrc1, fdest = fiadd.dd fsrc1, f0, fdest
    fmov.sd fsrc1, fdest = famov.sd fsrc1, fdest
    fmov.ds fsrc1, fdest = famov.ds fsrc1, fdest

fmul.p fsrc1, fsrc2, fdest    Floating-Point Multiply
    fdest ← fsrc1 × fsrc2

fnop    Floating-Point No Operation
    Assembler pseudo-operation
    fnop = shrd r0, r0, r0

form fsrc1, fdest    OR with MERGE Register
    fdest ← fsrc1 OR MERGE
    MERGE ← 0

frcp.p fsrc2, fdest    Floating-Point Reciprocal
    fdest ← 1 / fsrc2 with maximum mantissa error < 2^-7

frsqr.p fsrc2, fdest    Floating-Point Reciprocal Square Root
    fdest ← 1 / √fsrc2 with maximum mantissa error < 2^-7

Floating-Point Store
fst.y fdest, isrc1(isrc2)      (Normal)
fst.y fdest, isrc1(isrc2)++    (Autoincrement)
    mem.y (isrc2 + isrc1) ← fdest
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

fsub.p fsrc1, fsrc2, fdest    Floating-Point Subtract
    fdest ← fsrc1 - fsrc2

ftrunc.v fsrc1, fdest    Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1

fxfr fsrc1, idest    Transfer F-P to Integer Register
    idest ← fsrc1

fzchkl fsrc1, fsrc2, fdest    32-Bit Z-Buffer Check
    Consider the 64-bit operands as arrays of two 32-bit
    fields fsrc1(1)..fsrc1(0), fsrc2(1)..fsrc2(0), and fdest(1)..fdest(0)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM [i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

fzchks fsrc1, fsrc2, fdest    16-Bit Z-Buffer Check
    Consider the 64-bit operands as arrays of four 16-bit
    fields fsrc1(3)..fsrc1(0), fsrc2(3)..fsrc2(0), and fdest(3)..fdest(0)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM [i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

intovr    Software Trap on Integer Overflow
    IF OF = 1
    THEN generate trap with IT set in psr
    FI

ixfr isrc1ni, fdest    Transfer Integer to F-P Register
    fdest ← isrc1ni

ld.c csrc2, idest    Load from Control Register
    idest ← csrc2

ld.x isrc1(isrc2), idest    Load Integer
    idest ← mem.x (isrc1 + isrc2)

ldint.x isrc2, idest    Load Interrupt Vector
    idest ← int_vector.x (isrc2)
    NOTE: Not available with the i860 XR CPU

ldio.x isrc2, idest    Load I/O
    idest ← port.x (isrc2)
    NOTE: Not available with the i860 XR CPU

lock    Begin Interlocked Sequence
    Set BL in dirbase.
    The next load or store that appears on the bus locks that location.
    Disable interrupts until the bus is unlocked.

mov isrc2, idest    Register-Register Move
    Assembler pseudo-operation
    mov isrc2, idest = shl r0, isrc2, idest

mov const32, idest    Constant-to-Register Move
    Assembler pseudo-operation
    when 0xFFFF8000 ≤ const32 < 0x8000 . . .
        adds l%const32, r0, idest
    otherwise . . .
        orh h%const32, r0, idest
        or l%const32, idest, idest

nop    Core-Unit No Operation
    Assembler pseudo-operation
    nop = shl r0, r0, r0

or isrc1, isrc2, idest    Logical OR
    idest ← isrc1 OR isrc2
    CC set if result is zero, cleared otherwise

orh #const, isrc2, idest    Logical OR High
    idest ← (#const shifted left 16 bits) OR isrc2
    CC set if result is zero, cleared otherwise

pfadd.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add
    fdest ← last stage adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 + fsrc2

pfaddp fsrc1, fsrc2, fdest    Pipelined Add with Pixel Merge
    fdest ← last-stage graphics-unit result
    last-stage graphics-unit result ← fsrc1 + fsrc2
        (using integer arithmetic; 8-byte operands and destination)
    Shift, then load MERGE register from fsrc1 + fsrc2 as defined in Table 10.2

pfaddz fsrc1, fsrc2, fdest    Pipelined Add with Z Merge
    fdest ← last-stage graphics-unit result
    last-stage graphics-unit result ← fsrc1 + fsrc2
        (using integer arithmetic; 8-byte operands and destination)
    Shift MERGE right 16, then load fields 31..16 and 63..48 from fsrc1 + fsrc2

pfam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply
    fdest ← last stage adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfamov.r fsrc1, fdest    Pipelined Floating-Point Adder Move
    fdest ← last stage adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1

pfeq.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Equal Compare
    fdest ← last stage adder result
    CC set if fsrc1 = fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfgt.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Greater-Than Compare
    (Assembler clears R-bit of instruction)
    fdest ← last stage adder result
    CC set if fsrc1 > fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfiadd.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Add
    fdest ← last-stage graphics-unit result
    last-stage graphics-unit result ← fsrc1 + fsrc2 (2's complement integer arithmetic)

pfisub.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Subtract
    fdest ← last-stage graphics-unit result
    last-stage graphics-unit result ← fsrc1 - fsrc2 (2's complement integer arithmetic)

pfix.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion
    fdest ← last stage adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1 rounded

Pipelined Floating-Point Load
pfld.y isrc1(isrc2), fdest      (Normal)
pfld.y isrc1(isrc2)++, fdest    (Autoincrement)
    fdest ← mem.y (third previous pfld's (isrc1 + isrc2))
        (where .y is precision of third previous pfld.y)
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI
    NOTE: pfld.q is not available with the i860 XR CPU

pfle.p fsrc1, fsrc2, fdest    Pipelined F-P Less-Than or Equal Compare
    Assembler sets R-bit of instruction
    fdest ← last stage adder result
    CC clear if fsrc1 ≤ fsrc2, else set
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfmam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply
    fdest ← last stage multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfmov.r fsrc1, fdest    Pipelined Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    pfmov.ss fsrc1, fdest = pfiadd.ss fsrc1, f0, fdest
    pfmov.dd fsrc1, fdest = pfiadd.dd fsrc1, f0, fdest
    pfmov.sd fsrc1, fdest = pfamov.sd fsrc1, fdest
    pfmov.ds fsrc1, fdest = pfamov.ds fsrc1, fdest

pfmsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2

pfmul.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Multiply
    fdest ← last stage multiplier result
    Advance M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pfmul3.dd fsrc1, fsrc2, fdest    Three-Stage Pipelined Multiply
    fdest ← last stage multiplier result
    Advance 3-Stage M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pform fsrc1, fdest    Pipelined OR to MERGE Register
    fdest ← last-stage graphics-unit result
    last-stage graphics-unit result ← fsrc1 OR MERGE
    MERGE ← 0

pfsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2

pfsub.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract
    fdest ← last stage adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 - fsrc2

pftrunc.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion
    fdest ← last stage adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1

pfzchkl fsrc1, fsrc2, fdest    Pipelined 32-Bit Z-Buffer Check
    Consider the 64-bit operands as arrays of two 32-bit
    fields fsrc1(1)..fsrc1(0), fsrc2(1)..fsrc2(0), and fdest(1)..fdest(0)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM [i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last-stage graphics-unit result
        last-stage graphics-unit result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

pfzchks fsrc1, fsrc2, fdest    Pipelined 16-Bit Z-Buffer Check
    Consider the 64-bit operands as arrays of four 16-bit
    fields fsrc1(3)..fsrc1(0), fsrc2(3)..fsrc2(0), and fdest(3)..fdest(0)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM [i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last-stage graphics-unit result
        last-stage graphics-unit result(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
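
The field-wise comparison used by the Z-buffer check instructions can be sketched in Python as follows. The sketch models the non-pipelined behavior (the result goes straight to fdest, as in fzchkl and fzchks) and ignores the graphics-unit pipeline of pfzchkl and pfzchks. The function name zchk, the field_bits parameter, and the returned (fdest, PM, MERGE) tuple are assumptions made for illustration.

```python
def zchk(fsrc1, fsrc2, pm, field_bits):
    """Model of a Z-buffer check on 64-bit operands.

    field_bits is 32 (two fields) or 16 (four fields).
    Returns (fdest, new PM, new MERGE).
    """
    fields = 64 // field_bits
    field_mask = (1 << field_bits) - 1
    pm = (pm & 0xFF) >> fields              # shift right by 2 or 4 bits
    fdest = 0
    for i in range(fields):
        a = (fsrc1 >> (field_bits * i)) & field_mask
        b = (fsrc2 >> (field_bits * i)) & field_mask
        if b <= a:                          # unsigned compare: fsrc2(i) <= fsrc1(i)
            pm |= 1 << (i + 8 - fields)     # PM[i + 6] or PM[i + 4]
        fdest |= min(a, b) << (field_bits * i)
    return fdest, pm, 0                     # MERGE is cleared
```

Each call shifts the previous comparison results toward the low end of PM and deposits the new ones in the top bits, so PM accumulates one bit per pixel over successive checks.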

pst.d fdest, #const(isrc2)      Pixel Store
pst.d fdest, #const(isrc2)++    Pixel Store Autoincrement
    Pixels enabled by PM in mem.d (isrc2 + #const) ← fdest
    Shift PM right by 8/pixel size (in bytes) bits
    IF autoincrement
    THEN isrc2 ← #const + isrc2
    FI

scyc.x isrc2    Special Cycles
    Generate a special bus cycle (D/C# = 0, W/R# = 1, M/IO# = 0) and
    set BE7#-BE0# according to the value contained in the register isrc2
    NOTE: Not available with the i860 XR CPU

shl isrc1, isrc2, idest    Shift Left
    idest ← isrc2 shifted left by isrc1 bits

shr isrc1, isrc2, idest    Shift Right
    SC (in psr) ← isrc1
    idest ← isrc2 shifted right by isrc1 bits

shra isrc1, isrc2, idest    Shift Right Arithmetic
    idest ← isrc2 arithmetically shifted right by isrc1 bits

shrd isrc1ni, isrc2, idest    Shift Right Double
    idest ← low-order 32 bits of isrc1ni:isrc2 shifted right by SC bits
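
The interaction between shr (which loads SC) and shrd (which shifts by SC) can be illustrated with the Python model below. It is a sketch only: the function names are invented, SC is passed explicitly rather than held in psr, and the shift count is assumed to use just the low-order five bits of isrc1. The rotate_right helper is a hypothetical usage example, not an i860 XP instruction.

```python
MASK32 = 0xFFFFFFFF

def shr(isrc1, isrc2):
    """Return (idest, SC) for a logical shift right."""
    sc = isrc1 & 0x1F                       # assumption: 5-bit shift count
    return (isrc2 & MASK32) >> sc, sc

def shrd(isrc1ni, isrc2, sc):
    """Low-order 32 bits of the 64-bit value isrc1ni:isrc2 shifted right by SC."""
    combined = ((isrc1ni & MASK32) << 32) | (isrc2 & MASK32)
    return (combined >> sc) & MASK32

def rotate_right(value, count):
    """Rotate a 32-bit value: load SC with shr, then shrd value:value."""
    _, sc = shr(count, 0)                   # result discarded, only SC is needed
    return shrd(value, value, sc)
```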

st.c isrc1ni, csrc2    Store to Control Register
    csrc2 ← isrc1ni

st.x isrc1ni, #const(isrc2)    Store Integer
    mem.x (isrc2 + #const) ← isrc1ni

stio.x isrc1ni, isrc2    Store I/O
    port.x (isrc2) ← isrc1ni
    NOTE: Not available with the i860 XR CPU

subs isrc1, isrc2, idest    Subtract Signed
    idest ← isrc1 - isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 > isrc1 (signed)
    CC clear if isrc2 ≤ isrc1 (signed)

subu isrc1, isrc2, idest    Subtract Unsigned
    idest ← isrc1 - isrc2
    OF ← NOT (bit 31 carry)
    CC ← bit 31 carry
    (i.e. CC set if isrc2 ≤ isrc1 (unsigned)
    CC clear if isrc2 > isrc1 (unsigned))

trap isrc1ni, isrc2, idest    Software Trap
    Generate trap with IT set in psr

unlock    End Interlocked Sequence
    Clear BL in dirbase. The next load or store
    unlocks the bus. Interrupts are enabled.

xor isrc1, isrc2, idest    Logical Exclusive OR
    idest ← isrc1 XOR isrc2
    CC set if result is zero, cleared otherwise

xorh #const, isrc2, idest    Logical Exclusive OR High
    idest ← (#const shifted left 16 bits) XOR isrc2
    CC set if result is zero, cleared otherwise


Table 10.2. FADDP MERGE Update

Pixel Size    Fields Loaded from                Right Shift Amount
(from PS)     Result into MERGE                 (Field Size)
8             63..56, 47..40, 31..24, 15..8     8
16            63..58, 47..42, 31..26, 15..10    6
32            63..56, 31..24                    8

10.2 Instruction Format and Encoding

All instructions are 32 bits long and begin on a four-byte boundary. When
operands are registers, the encodings shown in Table 10.3 are used.

There are two general core-instruction formats (REG-format and CTRL-format)
and a separate format for floating-point instructions.


Table 10.3. Register Encoding

Register                     Encoding
r0                           0
r31                          31
f0                           0
f31                          31
Fault Instruction            0
Processor Status             1
Directory Base               2
Data Breakpoint              3
Floating-Point Status        4
Extended Processor Status    5
Bus Error Address*           6
Concurrency Control*         7
p0*                          8
p1*                          9
p2*                          10
p3*                          11

NOTE:
*Available only with i860 XP CPU. Using these encodings with the i860 XR
CPU produces undefined results.


10.2.1 REG-FORMAT INSTRUCTIONS

Within the REG-format are several variations as shown in Figure 10.1.
Table 10.4 gives the encodings for these instructions. One encoding is an
escape code that defines yet another variation: the core escape
instructions. Figure 10.2 shows the format of this group, and Table 10.5
shows the encodings.

In these instructions, the src2 field selects one of the 32 integer
registers (most instructions) or one of the control registers (st.c and
ld.c). Dest selects one of the 32 integer registers (most instructions) or
floating-point registers (fld, fst, pfld, pst, ixfr). For instructions
where src1 is optionally an immediate value, bit 26 of the opcode (I-bit)
indicates whether src1 is an immediate. If bit 26 is clear, an integer
register is used; if bit 26 is set, src1 is contained in the low-order 16
bits, except for bte and btne instructions. For bte and btne, the five-bit
immediate value is contained in the src1 field. For st, bte, btne, and bla,
the upper five bits of the offset or broffset are contained in the dest
field instead of src1, and the lower 11 bits of offset are the lower 11
bits of the instruction.
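
The field layout described above can be sketched as a small Python decoder. The field positions (opcode in bits 31..26, src2 in bits 25..21, dest in bits 20..16, src1 in bits 15..11) follow the bit ruler of Figure 10.1. The function names and the dictionary keys are assumptions made for this example, and treating the reassembled 16-bit offset as a signed value is likewise an assumption based on the sign-extension rules in the key to abbreviations.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def decode_reg_format(word):
    """Split a 32-bit REG-format instruction word into its fields."""
    word &= 0xFFFFFFFF
    return {
        "opcode": word >> 26,            # bits 31..26 (bit 26 doubles as the I-bit)
        "i_bit": (word >> 26) & 1,
        "src2": (word >> 21) & 0x1F,
        "dest": (word >> 16) & 0x1F,
        "src1": (word >> 11) & 0x1F,     # register number, or 5-bit immediate for bte/btne
        "imm16": word & 0xFFFF,          # meaningful when src1 is a 16-bit immediate
    }

def split_offset(word):
    """Reassemble the offset of st, bte, btne and bla.

    The upper five bits come from the dest field and the lower 11 bits
    from the low end of the instruction.
    """
    high = (word >> 16) & 0x1F
    low = word & 0x7FF
    return sign_extend((high << 11) | low, 16)
```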


For ld and st, bits 28 and zero determine operand size as follows:

Bit 28    Bit 0    Operand Size
0         0        8-bits
0         1        8-bits
1         0        16-bits
1         1        32-bits

When src1 is immediate and bit 28 is set, bit zero of the immediate value
is forced to zero.

For fld, fst, pfld, pst, and flush, bit 0 selects autoincrement addressing
if set. For fld, fst, pfld, and pst, bits one and two select the operand
size as follows:

Bit 1    Bit 2    Operand Size
0        0        64-bits
0        1        128-bits
1        0        32-bits
1        1        32-bits

For flush, bits one and two must be zero.

When src1 is immediate, bits zero and one of the immediate value are forced
to zero to maintain alignment. When bit one of the immediate value is
clear, bit two is also forced to zero.
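
The two operand-size tables above translate directly into code. The Python sketch below is illustrative only; the function names and return conventions are assumptions, and the functions do not check that the word actually holds one of the instructions in question.

```python
def ld_st_operand_bits(word):
    """Operand size in bits for ld and st, from bits 28 and 0."""
    if not (word >> 28) & 1:
        return 8                        # bit 28 clear: 8 bits regardless of bit 0
    return 32 if word & 1 else 16       # bit 28 set: bit 0 selects 16 or 32 bits

def fp_mem_operand(word):
    """Return (operand size in bits, autoincrement) for fld, fst, pfld and pst."""
    autoincrement = bool(word & 1)      # bit 0 selects autoincrement addressing
    bit1 = (word >> 1) & 1
    bit2 = (word >> 2) & 1
    if bit1:
        size = 32                       # bit 1 set: 32 bits, bit 2 is ignored
    elif bit2:
        size = 128
    else:
        size = 64
    return size, autoincrement
```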

For the instructions ldio, stio, ldint, and scyc, the operand size is
encoded by bits 9 and 10 as follows. For other instructions, these bits are
reserved and should be set to zero.

Operand Size    Bit 10    Bit 9
8 Bits (.b)     0         0
16 Bits (.s)    0         1
32 Bits (.l)    1         0
reserved        1         1
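
As an illustration of how the size field combines with the core escape encoding, the following Python sketch names a core escape instruction from its 32-bit word. It takes the escape prefix 010011 in bits 31..26 from Table 10.4 and the five-bit opcodes from Table 10.5; the function name, the lookup tables, and the strings returned for reserved or non-escape words are assumptions made for this example.

```python
CORE_ESCAPE = {
    0b00001: "lock", 0b00010: "calli", 0b00100: "intovr", 0b00111: "unlock",
    0b01000: "ldio", 0b01001: "stio", 0b01010: "ldint", 0b01011: "scyc",
}
SIZE_SUFFIX = {0b00: ".b", 0b01: ".s", 0b10: ".l"}   # bits 10 and 9
SIZED = ("ldio", "stio", "ldint", "scyc")

def decode_core_escape(word):
    """Return a mnemonic, '(reserved)', or None if word is not a core escape."""
    if (word >> 26) & 0x3F != 0b010011:
        return None
    name = CORE_ESCAPE.get(word & 0x1F)              # opcode in bits 4..0
    if name is None:
        return "(reserved)"
    if name in SIZED:
        suffix = SIZE_SUFFIX.get((word >> 9) & 0b11)
        return name + suffix if suffix else "(reserved)"
    return name
```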






[Figure: four REG-format field layouts. Field boundaries fall at bits
31..26, 25..21, 20..16, 15..11, and 10..0.
    OPCODE/I | SRC2 | DEST | SRC1 | IMMEDIATE, OFFSET, OR NULL
    OPCODE/I | SRC2 | DEST | IMMEDIATE
    OPCODE/I | SRC2 | OFFSET HIGH | SRC1 or SRC1S | OFFSET LOW
    OPCODE/I | SRC2 | OFFSET HIGH | IMMEDIATE | OFFSET LOW]

Figure 10.1. REG-Format Variations

Table 10.4. REG-Format Opcodes

Mnemonic             Description                    31  30  29  28  27  26
ld.x                 Load Integer                    0   0   0   L   0   I
st.x                 Store Integer                   0   0   0   L   1   1
ixfr                 Integer to F-P Reg Transfer     0   0   0   0   1   0
--                   (reserved)                      0   0   0   1   1   0
fld.x, fst.x         Load/Store F-P                  0   0   1   0   LS  I
flush                Flush                           0   0   1   1   0   1
pst.d                Pixel Store                     0   0   1   1   1   1
ld.c, st.c           Load/Store Control Register     0   0   1   1   LS  0
bri                  Branch Indirect                 0   1   0   0   0   0
trap                 Trap                            0   1   0   0   0   1
--                   (Escape for F-P Unit)           0   1   0   0   1   0
--                   (Escape for Core Unit)          0   1   0   0   1   1
bte, btne            Branch Equal or Not Equal       0   1   0   1   E   I
pfld.y               Pipelined F-P Load              0   1   1   0   0   I
--                   (CTRL-Format Instructions)      0   1   1   X   X   X
addu, -s, subu, -s   Add/Subtract                    1   0   0   SO  AS  I
shl, shr             Logical Shift                   1   0   1   0   LR  I
shrd                 Double Shift                    1   0   1   1   0   0
bla                  Branch LCC Set and Add          1   0   1   1   0   1
shra                 Arithmetic Shift                1   0   1   1   1   I
and(h)               AND                             1   1   0   0   H   I
andnot(h)            ANDNOT                          1   1   0   1   H   I
or(h)                OR                              1   1   1   0   H   I
xor(h)               XOR                             1   1   1   1   H   I
--                   (reserved)                      1   1   X   X   1   0

L    Integer Length
     0: 8 bits
     1: 16 or 32 bits (selected by bit 0)

LS   Load/Store
     0: Load
     1: Store

SO   Signed/Ordinal
     0: Ordinal
     1: Signed

H    High
     0: and, or, andnot, xor
     1: andh, orh, andnoth, xorh

AS   Add/Subtract
     0: Add
     1: Subtract

LR   Left/Right
     0: Left Shift
     1: Right Shift

E    Equal
     0: Branch on Unequal
     1: Branch on Equal

I    Immediate
     0: src1 is register
     1: src1 is immediate
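
The field layout and the opcode bits above can be exercised with a short decoder. The following is a minimal sketch rather than a reference implementation: it assumes the general REG-format field boundaries suggested by the bit ruler in Figure 10.1 (opcode in bits 31-26, SRC2 in 25-21, DEST in 20-16, SRC1 in 15-11), and it reads bit 26 of the logical opcodes as the Immediate flag from the legend. The function names are hypothetical.

```python
LOGICAL_OPS = ("and", "andnot", "or", "xor")  # selected by opcode bits 29..28


def decode_reg_format(word: int) -> dict:
    """Split a 32-bit REG-format word into its fields (general form, assumed layout)."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "src2": (word >> 21) & 0x1F,    # bits 25..21
        "dest": (word >> 16) & 0x1F,    # bits 20..16
        "src1": (word >> 11) & 0x1F,    # bits 15..11
        "low11": word & 0x7FF,          # bits 10..0 (immediate, offset, or null)
    }


def decode_logical(opcode: int):
    """Return (mnemonic, src1_is_immediate) for the 1 1 x x H I opcodes, else None."""
    if (opcode >> 4) & 0b11 != 0b11:
        return None                     # not one of the logical opcodes
    high = (opcode >> 1) & 1            # H bit (bit 27 of the instruction)
    immediate = opcode & 1              # I bit (bit 26 of the instruction)
    if high and not immediate:
        return None                     # 1 1 X X 1 0 is reserved
    name = LOGICAL_OPS[(opcode >> 2) & 0b11]
    return (name + "h" if high else name, bool(immediate))
```

For example, an opcode field of 111011 decodes as orh with an immediate src1, while 110110 falls in the reserved 1 1 X X 1 0 group and yields None.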


[Figure 10.2, drawing 240874-78: bits 31-26 = 010011, SRC2 (25-21), DEST (20-16), SRC1 (15-11), OPCODE (4-0). Shaded fields are RESERVED BY INTEL CORPORATION (SET TO ZERO).]

Figure 10.2. Core Escape Instructions 
Table 10.5. Core Escape Opcodes

Mnemonic   Description                    4   3   2   1   0
--         (reserved)                     0   0   0   0   0
lock       Begin Interlocked Sequence     0   0   0   0   1
calli      Indirect Subroutine Call       0   0   0   1   0
--         (reserved)                     0   0   0   1   1
intovr     Trap on Integer Overflow       0   0   1   0   0
--         (reserved)                     0   0   1   0   1
--         (reserved)                     0   0   1   1   0
unlock     End Interlocked Sequence       0   0   1   1   1
ldio*      Load I/O                       0   1   0   0   0
stio*      Store I/O                      0   1   0   0   1
ldint*     Load Interrupt Vector          0   1   0   1   0
scyc*      Special Cycles                 0   1   0   1   1
--         (reserved)                     0   1   1   X   X
--         (reserved)                     1   0   X   X   X
--         (reserved)                     1   1   X   X   X

NOTE:

* Available only with i860 XP CPU, not with i860 XR CPU.


10.2.2 CTRL-FORMAT INSTRUCTIONS 

The CTRL-Format instructions do not refer to registers; so, instead of the register fields, they have a 26-bit 
relative branch offset. Figure 10.3 shows the format of these instructions and Table 10.6 defines the encod- 
ings. 


[Figure 10.3, drawing 240874-79: bits 31-29 = 011, OPC (28-26), BROFFSET (25-0).]

NOTE: 

BROFFSET is a signed 26-bit relative branch offset 


Figure 10.3. CTRL-Format Instructions 


Table 10.6. CTRL-Format Opcodes

Mnemonic   Description           28  27  26
--         (reserved)             0   0   0
--         (reserved)             0   0   1
br         Branch Direct          0   1   0
call       Call                   0   1   1
bc(.t)     Branch on CC Set       1   0   T
bnc(.t)    Branch on CC Clear     1   1   T

T    Taken
     0: bc or bnc
     1: bc.t or bnc.t
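
As an illustration of the CTRL format, the sketch below extracts the 3-bit opcode and sign-extends the 26-bit BROFFSET field. It is a minimal example with a hypothetical function name; it returns the raw signed field value and makes no assumption about how the processor scales or applies the offset.

```python
CTRL_MNEMONICS = {
    0b010: "br",
    0b011: "call",
    0b100: "bc",
    0b101: "bc.t",
    0b110: "bnc",
    0b111: "bnc.t",
}


def decode_ctrl_format(word: int):
    """Return (mnemonic or None if reserved, signed 26-bit BROFFSET)."""
    if (word >> 29) & 0b111 != 0b011:
        raise ValueError("not a CTRL-format instruction (bits 31..29 must be 011)")
    opc = (word >> 26) & 0b111           # bits 28..26
    offset = word & 0x3FFFFFF            # bits 25..0
    if offset & 0x2000000:               # sign bit of the 26-bit field
        offset -= 0x4000000              # two's-complement sign extension
    return CTRL_MNEMONICS.get(opc), offset
```

A word of 0x6BFFFFFF, for instance, decodes as br with a field value of -1.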


10.2.3 FLOATING-POINT INSTRUCTION 
ENCODING 

The floating-point instructions also constitute an escape series. All these
instructions begin with the bit sequence 010010. Figure 10.4 shows the format
of the floating-point instructions, and Table 10.7 gives the encodings. Within
the dual-operation instructions is a subcode DPC whose values are given in
Table 10.8 along with the mnemonic that corresponds to each.
[Figure 10.4 drawing: bits 31-26 = 010010, SRC2 (25-21), DEST (20-16), SRC1 (15-11), four single-bit fields in bits 10-7 (P, D, S, and R, defined below), OPCODE (6-0).]


SRC1, SRC2 Source; one of 32 floating-point registers 

DEST Destination; one of 32 floating-point registers (except fxfr; one of 32 integer registers) 

P Pipelining 

1 Pipelined instruction mode 

0 Scalar instruction mode 

D Dual-Instruction Mode 

1 Dual-instruction mode 

0 Single-instruction mode 

S Source Precision 

1 Double-precision source operands 

0 Single-precision source operands 

R Result Precision 

1 Double-precision result 

0 Single-precision result 


Figure 10.4. Floating-Point Instruction Encoding 
Table 10.7. Floating-Point Opcodes

Mnemonic      Description                     6   5   4   3   2   1   0
pfam          Add and Multiply*               0   0   0   DPC
pfmam         Multiply with Add*              0   0   0   DPC
pfsm          Subtract and Multiply*          0   0   1   DPC
pfmsm         Multiply with Subtract*         0   0   1   DPC
(p)fmul       Multiply                        0   1   0   0   0   0   0
fmlow         Multiply Low                    0   1   0   0   0   0   1
frcp          Reciprocal                      0   1   0   0   0   1   0
frsqr         Reciprocal Square Root          0   1   0   0   0   1   1
pfmul3.dd     3-Stage Pipelined Multiply      0   1   0   0   1   0   0
(p)fadd       Add                             0   1   1   0   0   0   0
(p)fsub       Subtract                        0   1   1   0   0   0   1
(p)fix        Fix                             0   1   1   0   0   1   0
(p)famov      Adder Move                      0   1   1   0   0   1   1
pfgt/pfle**   Greater Than                    0   1   1   0   1   0   0
pfeq          Equal                           0   1   1   0   1   0   1
(p)ftrunc     Truncate                        0   1   1   1   0   1   0
fxfr          Transfer to Integer Register    1   0   0   0   0   0   0
(p)fiadd      Long-Integer Add                1   0   0   1   0   0   1
(p)fisub      Long-Integer Subtract           1   0   0   1   1   0   1
(p)fzchkl     Z-Check Long                    1   0   1   0   1   1   1
(p)fzchks     Z-Check Short                   1   0   1   1   1   1   1
(p)faddp      Add with Pixel Merge            1   0   1   0   0   0   0
(p)faddz      Add with Z Merge                1   0   1   0   0   0   1
(p)form       OR with MERGE Register          1   0   1   1   0   1   0

NOTE:

All opcodes not shown are reserved.

* pfam and pfsm have P-bit set; pfmam and pfmsm have P-bit clear.
** pfgt has R bit cleared; pfle has R bit set.
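
The two notes above mean that the 7-bit opcode alone does not identify a dual-operation or compare instruction; the P and R bits are also needed. The sketch below shows one way to resolve them. It is an illustration under a stated assumption: the P, D, S, and R bits are taken to occupy bits 10, 9, 8, and 7 respectively, which the figure ruler suggests but this section does not state explicitly. All function names are hypothetical.

```python
FP_ESCAPE = 0b010010  # bits 31..26 of every floating-point instruction


def decode_fp(word: int) -> dict:
    """Split a floating-point instruction word into fields (bit positions assumed)."""
    if (word >> 26) & 0x3F != FP_ESCAPE:
        raise ValueError("not a floating-point escape instruction")
    return {
        "src2": (word >> 21) & 0x1F,
        "dest": (word >> 16) & 0x1F,
        "src1": (word >> 11) & 0x1F,
        "P": (word >> 10) & 1,   # assumed position: pipelining
        "D": (word >> 9) & 1,    # assumed position: dual-instruction mode
        "S": (word >> 8) & 1,    # assumed position: source precision
        "R": (word >> 7) & 1,    # assumed position: result precision
        "opcode": word & 0x7F,   # bits 6..0
    }


def dual_operation(fields: dict):
    """Return (mnemonic, DPC) for dual-operation opcodes, else None."""
    group = fields["opcode"] >> 4          # opcode bits 6..4
    dpc = fields["opcode"] & 0xF           # opcode bits 3..0
    if group == 0b000:
        return ("pfam" if fields["P"] else "pfmam", dpc)
    if group == 0b001:
        return ("pfsm" if fields["P"] else "pfmsm", dpc)
    return None


def compare_mnemonic(fields: dict):
    """Distinguish pfgt from pfle using the R bit (opcode 0110100)."""
    if fields["opcode"] != 0b0110100:
        return None
    return "pfle" if fields["R"] else "pfgt"
```

The DPC value returned by dual_operation is the index into the DPC encoding table.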
Table 10.8. DPC Encoding

        PFAM       PFSM       M-Unit     M-Unit     A-Unit     A-Unit     T      K
DPC     Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load*
0000    r2p1       r2s1       KR         src2       src1       M result   No     No
0001    r2pt       r2st       KR         src2       T          M result   No     Yes
0010    r2ap1      r2as1      KR         src2       src1       A result   Yes    No
0011    r2apt      r2ast      KR         src2       T          A result   Yes    Yes
0100    i2p1       i2s1       KI         src2       src1       M result   No     No
0101    i2pt       i2st       KI         src2       T          M result   No     Yes
0110    i2ap1      i2as1      KI         src2       src1       A result   Yes    No
0111    i2apt      i2ast      KI         src2       T          A result   Yes    Yes
1000    rat1p2     rat1s2     KR         A result   src1       src2       Yes    No
1001    m12apm     m12asm     src1       src2       A result   M result   No     No
1010    ra1p2      ra1s2      KR         A result   src1       src2       No     No
1011    m12ttpa    m12ttsa    src1       src2       T          A result   Yes    No
1100    iat1p2     iat1s2     KI         A result   src1       src2       Yes    No
1101    m12tpm     m12tsm     src1       src2       T          M result   No     No
1110    ia1p2      ia1s2      KI         A result   src1       src2       No     No
1111    m12tpa     m12tsa     src1       src2       T          A result   No     No

        PFMAM      PFMSM      M-Unit     M-Unit     A-Unit     A-Unit     T      K
DPC     Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load*
0000    mr2p1      mr2s1      KR         src2       src1       M result   No     No
0001    mr2pt      mr2st      KR         src2       T          M result   No     Yes
0010    mr2mp1     mr2ms1     KR         src2       src1       M result   Yes    No
0011    mr2mpt     mr2mst     KR         src2       T          M result   Yes    Yes
0100    mi2p1      mi2s1      KI         src2       src1       M result   No     No
0101    mi2pt      mi2st      KI         src2       T          M result   No     Yes
0110    mi2mp1     mi2ms1     KI         src2       src1       M result   Yes    No
0111    mi2mpt     mi2mst     KI         src2       T          M result   Yes    Yes
1000    mrmt1p2    mrmt1s2    KR         M result   src1       src2       Yes    No
1001    mm12mpm    mm12msm    src1       src2       M result   M result   No     No
1010    mrm1p2     mrm1s2     KR         M result   src1       src2       No     No
1011    mm12ttpm   mm12ttsm   src1       src2       T          M result   Yes    No
1100    mimt1p2    mimt1s2    KI         M result   src1       src2       Yes    No
1101    mm12tpm    mm12tsm    src1       src2       T          M result   No     No
1110    mim1p2     mim1s2     KI         M result   src1       src2       No     No
1111    Intel Reserved

NOTE:

* If K-load is set, KR is loaded when operand-1 of the multiplier is KR; KI is loaded when operand-1 of the multiplier is KI.
10.3 Instruction Timings

Generally, i860 XP microprocessor instructions take one clock to execute
unless a freeze condition is invoked. Detailed times, along with freeze
conditions and their associated delays, are shown in the table on the
following pages. The following symbols are used for brevity in the timing
table:

+n      n clocks must be added to the execution time if the stated
        conditions apply.

<-> n   The processor requires at least n clocks between the indicated
        instructions. The actual delay will be n minus the number of clocks
        for executing intervening instructions (or dual-mode pairs). If the
        time for intervening instructions is ≥ n, there is no delay.

n..m    Indicates a range of clocks. These cases are accompanied by a
        reference to a note where further explanation is available.

XR:     Applies to i860 XR microprocessors only.

XP:     Applies to i860 XP microprocessors only.

OA      The number of clocks to finish all outstanding accesses.

R1      The number of clocks from ADS# through the first READY# (80860XR)
        or BRDY# (80860XP) of the indicated bus activity.

R2      The number of clocks from ADS# through the second READY# or BRDY#.

RL      The number of clocks from ADS# through the last READY# or BRDY#.

RL1     XP: The number of clocks through last BRDY# of first access.

RN      XR: The number of clocks until next nonrepeated address can be
        issued (i.e., an address that is not the 2nd-4th cycle of a cache
        fill, the 2nd-8th cycle of a CS8 mode instruction fetch, nor the
        2nd cycle of a 128-bit write).

RX      The number of clocks through READY# or BRDY# for the next
        64-bit-or-less write cycle or second READY# or BRDY# for the next
        128-bit write cycle.
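
The rule for a minimum-separation entry (at least n clocks between the indicated instructions) reduces to a one-line calculation. A minimal sketch, with a hypothetical function name:

```python
def interlock_delay(n: int, intervening_clocks: int) -> int:
    """Extra clocks the processor waits for a minimum-separation entry of n.

    intervening_clocks is the time taken by the instructions (or dual-mode
    pairs) executed between the two indicated instructions.
    """
    if n < 0 or intervening_clocks < 0:
        raise ValueError("clock counts must be non-negative")
    # The delay is n minus the intervening time, and never less than zero.
    return max(0, n - intervening_clocks)
```

With a required separation of 3 clocks and one intervening single-clock instruction, the processor waits 2 more clocks; with three or more intervening clocks there is no delay.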

NOTES:

a. “Address path full” means one address internally waiting for the bus
   while the external bus pipeline is full.

b. “Store path full” means two stores or one 256-bit write-back internally
   waiting for the bus plus the external bus pipeline full.

c. If a floating-point instruction, graphics-unit instruction, fst, or pst
   is executed when a scalar floating-point operation (other than frcp or
   frsqr) is in progress, the scalar operation must complete first: two
   additional clocks for fadd, fix, fmlow, fmul.ss, fmul.sd, ftrunc, and
   fsub; three additional clocks for fmul.dd. Add one if either or both of
   these situations occur:

   1. There is an overlap between the result register of the previous
      scalar operation and the source of the floating-point operation, and
      the destination precision of the scalar operation differs from the
      source precision of the floating-point operation.

   2. The floating-point operation is pipelined and its destination is
      not f0.
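
Note c can be restated as a small lookup, which also shows where the 2..4 range quoted in the timing table comes from. This is a sketch of the rule as worded above, not a cycle-accurate model; the function and parameter names are hypothetical, and returning 0 for frcp and frsqr simply reflects that the note excludes them.

```python
TWO_CLOCK_OPS = {"fadd", "fix", "fmlow", "fmul.ss", "fmul.sd", "ftrunc", "fsub"}


def scalar_completion_clocks(op_in_progress: str,
                             overlap_with_precision_mismatch: bool = False,
                             pipelined_with_dest_not_f0: bool = False) -> int:
    """Additional clocks needed for the scalar operation in progress to complete."""
    if op_in_progress in ("frcp", "frsqr"):
        return 0                       # note c does not apply to these
    if op_in_progress in TWO_CLOCK_OPS:
        clocks = 2
    elif op_in_progress == "fmul.dd":
        clocks = 3
    else:
        raise ValueError(f"not a scalar operation covered by note c: {op_in_progress}")
    # One extra clock if either or both situations occur (not one per situation).
    if overlap_with_precision_mismatch or pipelined_with_dest_not_f0:
        clocks += 1
    return clocks
```

The minimum is 2 (for example fadd with neither situation) and the maximum is 4 (fmul.dd with at least one situation).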

TLB     TLB miss. Five clocks plus the number of clocks to finish two reads
        plus the number of clocks to set A-bits (if necessary).

In addition, any instruction may be delayed due to an instruction cache miss
or TLB miss during the instruction fetch. The time for a TLB miss is shown
above in note TLB. An instruction cache miss adds the following delays:

• The number of clocks to get the next instruction from the bus (ADS# clock
  to first READY# or BRDY# clock, inclusive).

• XR: When any of the instructions in the new instruction-cache line is a
  branch or call or causes a freeze, the time through the last READY# for
  the new line.

• If the data cache is being accessed when the instruction-cache miss
  occurs, two clocks for data cache miss; one clock for hit.

Not included in the table is the delay caused by a trap. This depends on the
trap handler.

In dual-instruction mode, each pair of instructions requires the maximum of
the times required by each individual instruction.
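
The dual-instruction-mode rule lends itself to a short worked example. The sketch below sums the cost of a sequence of instruction pairs, taking the larger of the two times for each pair. It deliberately ignores the interlocks, cache misses, and traps described above, and the function names are hypothetical.

```python
def pair_clocks(core_clocks: int, fp_clocks: int) -> int:
    """Clocks for one dual-mode pair: the maximum of the two individual times."""
    return max(core_clocks, fp_clocks)


def sequence_clocks(pairs) -> int:
    """Total clocks for a sequence of (core_clocks, fp_clocks) pairs."""
    return sum(pair_clocks(core, fp) for core, fp in pairs)
```

For three pairs costing (1, 1), (2, 1), and (1, 3) clocks, the total is 1 + 2 + 3 = 6 clocks.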
Instruction   Execution   Condition
              Clocks

adds          1
addu          1
and           1
andh          1
andnot        1
andnoth       1
bc            1           If branch not taken.
              2           If branch taken.
              +1          If the prior instruction is addu, adds, subu, subs, pfeq, or pfgt.
bc.t          1           If branch taken.
              2           If branch not taken.
              +1          If the prior instruction is addu, adds, subu, subs, pfeq, or pfgt.
bla           1           If branch taken.
              2           If branch not taken.
bnc                       (same as bc)
bnc.t                     (same as bc.t)
br            1
bri           2
bte           1           If branch not taken.
              3           If branch taken.
btne                      (same as bte)
call          1
              +1          If r1 referenced in next instruction.
              +1+R1       If data cache load miss in progress for a read of less than 128 bits.
              +1+R2       If data cache load miss in progress for 128-bit read.
calli         2
              +1          If r1 referenced in next instruction.
              +1+R1       If data cache load miss in progress for a read of less than 128 bits.
              +1+R2       If data cache load miss in progress for 128-bit read.
fadd.p        1           (... and all other A-unit instructions except dual operations)
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
faddp         1           (... and all other G-unit instructions except fiadd.w, fxfr)
              +1          If fdest is used by next instruction and next instruction is G-, M- or A-unit instruction.
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
faddz                     (same as faddp)
famov.r                   (same as fadd.p)
fiadd.w       1
              +1          If fdest is used by next instruction and next instruction is M- or A-unit instruction (except when fiadd is used for fmov.dd or fmov.ss).
              +1          If fdest is used by next instruction and next instruction is G-unit instruction.
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
fisub.w                   (same as faddp)
fix.v                     (same as fadd.p)
fld.y         1
              +1          If this is the instruction after a st, fst or pst that hits the data cache.
              <-> 2       If fdest is referenced in the next two instructions.
              +1+R1       If 32-bit fld.l or 64-bit fld.d misses the data cache.
              +1+R2       If 128-bit fld.q misses the data cache.
              +1+RL       If data cache load miss in progress (except in the following case).
              <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but hits in the physical tags.
              +2          XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              +RN         XR: If address path full.(a)
              +RL1        XP: If address path full.(a)
              +TLB        If TLB miss.
flush         1
              <-> 3       XR: If preceded by another flush.
              <-> 2       XP: If preceded by another flush.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              +1+RX       If flush to modified line when store path full.(b)
              +TLB        If TLB miss.
fmlow.dd      1           (... and all other M-unit instructions except dual operations)
              +1          If fsrc1 refers to result of the prior operation (either scalar or pipelined).
              +1          If the prior operation is a double-precision multiply.
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
fmov.r                    fmov.ss and fmov.dd same as fiadd.w
                          fmov.sd and fmov.ds same as fadd.p
fmul.p                    (same as fmlow.dd)
fnop
form                      (same as faddp)
frcp.p                    (same as fmlow.dd)
frsqr.p                   (same as fmlow.dd)
fst.y
              +1          If followed by pipelined floating-point operation that overwrites the register being stored.
              +1+RL       If data cache load miss in progress.
              +2          XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
              <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but hits in the physical tags.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
              +RN         XR: If address path full.(a)
              +RL1        XP: If address path full.(a)
              +1+RX       If cache miss when store path full.(b)
              +TLB        If TLB miss.
fsub.p                    (same as fadd.p)
ftrunc.v                  (same as fadd.p)
fxfr          1
              +1          If idest referenced in next instruction.
              +1+R1       If data cache load miss in progress for 64-bit read.
              +1+R2       If data cache load miss in progress for 128-bit read.
              <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in progress.(c)
fzchkl                    (same as faddp)
fzchks                    (same as faddp)
intovr        1
ixfr          1
              +1+R1       If data cache load miss in progress for 64-bit read.
              +1+R2       If data cache load miss in progress for 128-bit read.
              <-> 2       If fdest is referenced in the next two instructions.
ld.c          1
              +1          If idest referenced in next instruction.
              +1+R1       If data cache load miss in progress for 64-bit read.
              +1+R2       If data cache load miss in progress for 128-bit read.
ld.x          1
              +1          If idest referenced in next instruction.
              +1          If this is the instruction after a st, fst or pst that hits the data cache.
              +1+RL       If data cache load miss in progress.
              1+R1        If ld.x misses the data cache and a subsequent instruction references the idest of the ld.x (except for following case).
              <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but hits in the physical tags.
              +2          XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              +RN         XR: If address path full.(a)
              +RL1        XP: If address path full.(a)
              +1+RX       If cache miss when store path full.(b)
              +TLB        If TLB miss.
ldint.x       1+OA
ldio.x        1+OA
lock          1
mov           1
nop           1
or            1
orh           1
pfadd.p                   (same as fadd.p)
pfaddp                    (same as faddp)
pfaddz                    (same as faddp)
pfam.p        1           (... and all oth [text truncated in source]
              +1          If fsrc1 refers t [text truncated in source]
              +1          If the prior ope [text truncated in source]
              <-> 2..4    If executed wh [text truncated in source] in progress.(c)
pfamov.r                  (same as fadd.p)
pfeq.p                    (same as fadd.p)
pfgt.p                    (same as fadd.p)
pfiadd.w                  (same as faddp)
pfisub.w                  (same as faddp)
pfix.v                    (same as fadd.p)
pfld.y        1
              +1+RL       If data cache load miss in progress.
              <-> 2       If fdest is referenced in the next two instructions.
              +1+RL1      If three pfld's are outstanding.
              +2+OA       XR: If pfld hits data cache.
              +2          XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
              <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but hits in the physical tags.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              +RN         XR: If address path full.(a)
              +RL1        XP: If address path full.(a)
              +TLB        If TLB miss.
pfle.p        1
pfmam.p                   (same as pfam.p)
pfmov.r                   pfmov.ss and pfmov.dd same as faddp
                          pfmov.sd and pfmov.ds same as fadd.p
pfmsm.p                   (same as pfam.dd)
pfmul.p                   (same as fmlow.dd)
pfmul3.dd                 (same as fmlow.dd)
pform                     (same as faddp)
pfsm.p                    (same as pfam.dd)
pfsub.p                   (same as fadd.p)
pftrunc.v                 (same as fadd.p)
pfzchkl                   (same as faddp)
pfzchks                   (same as faddp)
pst.d                     (same as fst.d)
scyc.x        1+OA
shl           1
shr           1
shra          1
shrd          1
st.c          3
              +1+R1       If data cache load miss in progress for a read of less than 128 bits.
              +1+R2       If data cache load miss in progress for 128-bit read.
st.x          1
              +1+RL       If data cache load miss in progress.
              +2          XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
              <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but hits in the physical tags.
              +R2         XP: If data-cache line write-back due to snoop is in progress.
              +RN         XR: If address path full.(a)
              +RL1        XP: If address path full.(a)
              +1+RX       If cache miss when store path full.(b)
              +TLB        If TLB miss.
stio.x        1+OA
subs          1
subu          1
trap          1
unlock        1
xor           1
xorh          1



10.4 Instruction Characteristics 

The following table lists some of the characteristics of each instruction.
The characteristics are:

• What processing unit executes the instruction. The codes for processing
  units are:

  A   Floating-point adder unit
  E   Core execution unit
  G   Graphics unit
  M   Floating-point multiplier unit

• Whether the instruction is pipelined or not. A P indicates that the
  instruction is pipelined.

• Whether the instruction is a delayed branch instruction. A D marks the
  delayed branches.

• Whether execution is suppressed in user mode. An SU marks supervisor-only
  instructions.

• Whether the instruction is available on both the i860 XR and i860 XP
  microprocessors. An XP marks instructions that are available only on the
  i860 XP microprocessor.

• Whether the instruction changes the condition code CC. A CC marks those
  instructions that change CC.

• Which faults can be caused by the instruction. The codes used for
  exceptions are:

  IT    Instruction Fault
  SE    Floating-Point Source Exception
  RE    Floating-Point Result Exception, including overflow, underflow,
        inexact result
  DAT   Data Access Fault

  Note that this is not the same as specifying at which instructions faults
  may be reported. A result exception is reported on the subsequent
  floating-point instruction, pst, fst, or sometimes fld, pfld, and ixfr.

  The instruction access fault IAT and the interrupt trap IN are not shown
  in the table because they can occur for any instruction.

• Performance notes. These comments regarding optimum performance are
  recommendations only. If these recommendations are not followed, the
  i860 XP microprocessor automatically waits the necessary number of clocks
  to satisfy internal hardware requirements. The following notes define the
  numeric codes that appear in the instruction table:

  1. The following instruction should not be a conditional branch (bc, bnc,
     bc.t, or bnc.t).

  2. The destination should not be a source operand of the next two
     instructions.

  3. A load should not directly follow a store that is expected to hit in
     the data cache.

  4. When the prior instruction is scalar, fsrc1 should not be the same as
     the fdest of the prior operation.

  5. The fdest should not reference the destination of the next instruction
     if that instruction is a pipelined floating-point operation.

  6. The destination should not be a source operand of the next
     instruction. (For call and calli, the destination is r1.)

  7. When the prior operation is scalar and multiplier op1 is fsrc1, fsrc2
     should not be the same as the fdest of the prior operation.

  8. When the prior operation is scalar, src1 and src2 of the current
     operation should not be the same as dest of the prior operation.

  9. A pfld should not immediately follow a pfld.

• Programming restrictions. These indicate combinations of conditions that
  must be avoided by programmers, assemblers, and compilers. The following
  notes define the alphabetic codes that appear in the instruction table:

  a. The sequential instruction following a delayed control-transfer
     instruction may not be another control-transfer instruction, nor a
     trap instruction, nor the target of a control-transfer instruction.

  b. When using a bri to return from a trap handler, programmers should
     take care to prevent traps from occurring on that or on the next
     sequential instruction. IM should be zero (interrupts disabled) when
     the bri is executed.

  c. If fdest is not zero, fsrc1 must not be the same as fdest.

  d. When fsrc1 goes to multiplier op1 or to KR or KI, fsrc1 must not be
     the same as fdest.

  e. If dest is not zero, src1 and src2 must not be the same as dest.

  f. isrc1 must not be the same register as isrc2 for the autoincrementing
     form of this instruction.

  g. isrc1 must not be the same register as isrc2.

  h. flush must not be used in a locked sequence or in dual-instruction
     mode.


              Execution  Pipelined?       Sets              Performance  Programming
Instruction   Unit       Delayed?         CC?    Faults     Notes        Restrictions
                         Supervisor?
                         i860 XP Only?

adds          E                           CC                1
addu          E                           CC                1
and           E                           CC
andh          E                           CC
andnot        E                           CC
andnoth       E                           CC
bc            E
bc.t          E          D                                               a
bla           E          D                                               a,g
bnc           E
bnc.t         E          D                                               a
br            E          D                                               a
bri           E          D                                               a,b
bte           E
btne          E
call          E          D                                  6            a
calli         E          D                                  6            a
fadd.p        A                                  SE,RE
faddp         G                                             8
faddz         G                                             8
famov.r       A                                  SE,RE
fiadd.w       G
fisub.w       G                                             8
fix.p         A                                  SE,RE
fld.y         E                                  DAT        2,3          f
flush         E                                                          h
fmlow.dd      M                                             4
fmul.p        M                                  SE,RE      4
form          G                                             8
frcp.p        M                                  SE,RE
frsqr.p       M                                  SE,RE
fst.y         E                                  DAT        5            f
fsub.p        A                                  SE,RE
ftrunc.p      A                                  SE,RE
fxfr          G                                             6,8
fzchkl        G                                             8
fzchks        G                                             8
intovr        E                                  IT
ixfr          E                                             2
ld.c          E
ld.x          E                                  DAT        6
ldint.x       E          SU,XP                   DAT
ldio.x        E          SU,XP                   DAT
lock          E
or            E                           CC
orh           E                           CC
pfadd.p       A          P                       SE,RE*
pfaddp        G          P                       *          8            e
pfaddz        G          P                                  8            e
pfam.p        A&M        P                       SE,RE*     7            d
pfamov.r      A          P                       SE,RE*
pfeq.p        A          P                CC     SE*        1
pfgt.p        A          P                CC     SE*        1
pfiadd.w      G          P                       *          8            e
pfisub.w      G          P                       *          8            e
pfix.p        A          P                       SE,RE*
pfld.y        E          P,(XP)**                DAT*       2,9          f
pfmam.p       A&M        P                       SE,RE*     7            d
pfmsm.p       A&M        P                       SE,RE*     7            d
pfmul.p       M          P                       SE,RE*     4            c
pfmul3.dd     M          P                       SE,RE*     4            c
pform         G          P                       *          8            e
pfsm.p        A&M        P                       SE,RE*     7            d
pfsub.p       A          P                       SE,RE*
pftrunc.p     A          P                       SE,RE*
pfzchkl       G          P                       *          8
pfzchks       G          P                       *          8
pst.d         E                                  DAT        5            f
scyc.x        E          SU,XP                   DAT
shl           E
shr           E
shra          E
shrd          E
st.c          E
st.x          E                                  DAT
stio.x        E          SU,XP                   DAT
subs          E                           CC                1
subu          E                           CC                1
trap          E                                  IT
unlock        E
xor           E                           CC
xorh          E                           CC

NOTES:

* On the i860 XP microprocessor, the pipelined instructions can generate ITR with PI.

** On the i860 XR microprocessor, the 128-bit pfld.q is not available. If used it causes an instruction trap.


10.5 Software Compatibility 

10.5.1 REQUIRED CHANGES 

To port existing systems software from the i860 XR microprocessor to the
i860 XP microprocessor, the following changes may be required. Applications
software does not require changes.

1. Data cache flush. All four ways of the data cache must be flushed on the
   i860 XP microprocessor. The cache flush routine can be modified to check
   processor type in epsr or the DCS field of dirbase and flush the
   appropriate number of ways.

2. Parity and bus error traps. If the i860 XP system signals these errors,
   the trap handler must be extended to handle them. Software must avoid
   testing the BEF and PEF bits unless executing on the i860 XP
   microprocessor.

3. LOCK# deactivation. On the i860 XP microprocessor, traps do not
   automatically deactivate the LOCK# signal, so the trap handler must do a
   data access to deactivate LOCK#. Trap handlers that already access data
   soon after invocation do not require this modification.

4. Load pipe precision. The precision of the last stage of the load
   pipeline is specified by the LRP bit on the i860 XR microprocessor but
   by the LRP0 and LRP1 bits on the i860 XP microprocessor. The procedure
   that restores the load pipe must check the processor type, use the
   appropriate bits, and restore the correct precision. Pipe restoration
   code for the i860 XR microprocessor will work correctly on the i860 XP
   microprocessor if pfld.q is not used.

5. Pre-accessed trap handler pages. Page-directory and page-table entries
   for the instruction pages of the trap handler and for the first data
   page accessed by the trap handler must always have A = 1. Software
   modified to allocate page tables this way works on both i860 XR and
   i860 XP microprocessors.

6. Page directory entry bit 7 must be zero. This is the bit that selects
   four Mbyte or four Kbyte page size. On the i860 XR microprocessor, it is
   reserved and should be set to zero. It must be set to zero for four
   Kbyte pages to work on the i860 XP microprocessor.
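
Several of the changes above come down to branching on the processor type. The sketch below gathers those differences into one lookup. It is purely illustrative: the structure, field names, and the `is_xp` flag are hypothetical, how the processor type is read from epsr or dirbase is not shown, and the two-way figure for the i860 XR data cache is an assumption not stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortingProfile:
    data_cache_ways_to_flush: int         # item 1
    may_test_bef_pef: bool                # item 2
    trap_handler_must_release_lock: bool  # item 3
    load_pipe_precision_bits: tuple       # item 4


def porting_profile(is_xp: bool) -> PortingProfile:
    """Return the processor-dependent choices for systems software.

    is_xp is assumed to come from a processor-type check performed elsewhere.
    """
    if is_xp:
        return PortingProfile(
            data_cache_ways_to_flush=4,
            may_test_bef_pef=True,
            trap_handler_must_release_lock=True,
            load_pipe_precision_bits=("LRP0", "LRP1"),
        )
    return PortingProfile(
        data_cache_ways_to_flush=2,       # assumption: two-way cache on the XR
        may_test_bef_pef=False,
        trap_handler_must_release_lock=False,
        load_pipe_precision_bits=("LRP",),
    )
```

Items 5 and 6 are not represented because software written to satisfy them works unchanged on both processors.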

10.5.2 PERFORMANCE OPTIMIZATIONS 

Software developers may wish to make the following performance enhancements
in systems software for the i860 XP microprocessor. Systems software that
must execute on both i860 XP and i860 XR systems can contain code both with
and without the optimizations. By testing the processor type, the
appropriate instruction path can be determined.

1. Data cache flush. On the i860 XP microprocessor, a complete flushing of
   the data cache is not needed when changing context or marking a page not
   present.

2. The epsr bits AI, DI, PI, and PT can be used on the i860 XP
   microprocessor to make trap handlers more efficient.

3. Four-Mbyte pages can be allocated to frame buffers and the
   operating-system kernel, thereby reducing the cost of TLB misses.

10.5.3 NEW FEATURES 

Software that uses the new features available only on the i860 XP
microprocessor will not be compatible with the i860 XR microprocessor unless
alternate instruction paths are provided.

Systems software features:

1. New instructions ldio, stio, ldint, and scyc.

2. Four-Mbyte pages.

3. Privileged Registers p0, p1, p2, and p3.

4. Concurrency control unit.

5. 128-bit load instruction pfld.q.

6. Support for virtual address aliases.

Applications software features:

1. Concurrency control unit.

2. 128-bit load instruction pfld.q. The i860 XR microprocessor traps on
   pfld.q; therefore, software has the opportunity to emulate a pfld.q with
   two pfld.d instructions. However, this strategy does not yield optimal
   performance on the i860 XR microprocessor.

10.5.4 NOTES 

On the i860 XP microprocessor, pages with WT = 1 are cached with the
write-through policy, whereas on the i860 XR microprocessor they are not
cached at all. Because this change in the function of WT was anticipated in
the i860 XR microprocessor documentation, no incompatibility should arise.


11.0 REVISION HISTORY 


DATA SHEET REVISION REVIEW

The following list represents the major differences between version 002 and
version 001 of the i860 XP Microprocessor Data Sheet.

Section 2.2.4       AI bit has been changed to TAI in Figure 2.5. The
                    explanation for PI bit has been expanded.

Section 4.2.33      PCHK# signal description has been expanded.

Section 4.2.35      Output buffer configuration has been added in PEN#
                    signal description.

Section 4.2.37      RESET description has been expanded.

Section 5.1.3       Table 5.2 has been corrected. The explanation of
                    write/read and read/write pipelining has been revised.

Section 5.2.2.4-5   The explanation of late back-off mode has been expanded.

Section 5.2.4       Figure 5.27 has been corrected.

Section 5.3.4       The explanation of EWBE# timing has been corrected.

Section 5.5         RESET initialization description has been expanded.

Section 9.2         D.C. Characteristics are corrected.

Section 9.3         A.C. Characteristics are replaced with nominal timings
                    based on CL = 0 pF. Figure 9.3 and Figure 9.4 have been
                    replaced with nominal A.C. timings based on CL = 0 pF.
                    Figure 9.5 has been corrected for normal and
                    high-current output buffers.

Section 9.4         Component buffer model has been added.

Section 10.4        Programming restriction on flush instruction has been
                    added.
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A 

8-bit pixel 

data type, 2.1.4 

16-bit pixel 

data type, 2.1.4 

16-bit values 

alignment requirements, 2.3 

32-bit binary floating-point 
single-precision real, 2.1.3 

32-bit integer 
data type, 2.1.1 

32-bit ordinal 
data type, 2.1.2 

32-bit pixel 

data type, 2.1.4 

32-bit values 

alignment requirements, 2.3 

64-bit binary floating-point 
double-precision real, 2.1.3 
floating-point register file, 2.2.2 

64-bit integer 
data type, 2.1.1 
floating-point register file, 2.2.2 

64-bit values 

alignment requirements, 2.3 

128-bit load and store instructions 
floating-point register file, 2.2.2 

128-bit values 

alignment requirements, 2.3 

82495XP/82490XP cache 
BRDY# (burst ready), 4.2.7 
external secondary cache, 1.0 
write-once policy, 3.2.4.2 

A31-A3 (address pins) 
signal description, 4.2.1 

A (accessed) 

page-table entries (PTEs), 2.4.4.6 


AA 

fsr U-bit (update bit), 2.2.8 

access rights 

address translation caches, 3.1 

A.C. characteristics 
electrical data, 9.3 

addressing 

i860 XP microprocessor, 2.3 
modes, 2.7 

address space 
consistency, 3.3.1 

address translation 
algorithm, 2.4.5 
caches, 3.1 
faults, 2.4.6 
P (present) bit, 2.4.4.2 
virtual addressing, 2.4 

adds (Add Signed) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

addu (Add Unsigned) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

ADS# (address status) 

AHOLD (address hold), 4.2.3 
signal description, 4.2.2 

AE 

fsr U-bit (update bit), 2.2.8 

AHOLD (address hold) 
bus arbitration, 5.2 
signal description, 4.2.3 

algorithm 

address translation, 2.4.5 
cache replacement, 3.2.3 

aliasing 

instruction cache, 3.2.2 

internal instruction and data caches, 3.2 

alignment 

requirements, 2.3 

andh (Logical AND High) 
instruction definition, 10.1 
instruction timing, 10.3 

and (Logical AND) 

instruction definition, 10.1 
instruction timing, 10.3 

andnoth (Logical AND NOT High) 
instruction definition, 10.1 
instruction timing, 10.3 

andnot (Logical AND NOT) 
instruction definition, 10.1 
instruction timing, 10.3 

ANSI/IEEE Standard 754-1985, 1.0 

AO 

fsr U-bit (update bit), 2.2.8 

arbitration 

bus operation, 5.2 
HOLD and HLDA, 5.2.1 

ATE (address translation enable) 
address translation, 2.4 
dirbase format description, 2.2.6 

AU 

fsr U-bit (update bit), 2.2.8 

B 

back-off 

bus cycle, 5.2.2 
late modes, 5.2.2.3 
one-clock late mode, 5.2.2.4 
two-clock late mode, 5.2.2.5 

bc (Branch on CC) 

instruction definition, 10.1 
instruction timing, 10.3 

bc.t (Branch on CC, Taken) 
instruction definition, 10.1 
instruction timing, 10.3 


BE7#-BE0# (byte enables) 
signal description, 4.2.4 

bear (bus error address register) 
format description, 2.2.10 

BE (big endian) 
data cache, 3.2.1 
epsr format description, 2.2.4 

BEF (bus error flag) 

epsr format description, 2.2.4 

BEn# 

BE7#-BE0# (byte enables), 4.2.4 

BERR (bus error) 

bear (bus error address register), 2.2.10 

bus error trap, 2.8.7 

epsr BEF (bus error flag), 2.2.4 

psr IM (interrupt mode), 2.2.3 

signal description, 4.2.5 

big endian mode 
addressing, 2.3 

bla (Branch on LCC and Add) 

epsr AI (trap on autoincrement instruction), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

BL (bus lock) 

dirbase format description, 2.2.6 

bnc (Branch on Not CC) 
instruction definition, 10.1 
instruction timing, 10.3 

bnc.t (Branch on Not CC, Taken) 
instruction definition, 10.1 
instruction timing, 10.3 

BOFF# (back-off) 

ADS# (address status), 4.2.2 
BERR (bus error), 4.2.5 
bus arbitration, 5.2 

dirbase LB (late back-off mode), 2.2.6 
FLINE# choice, 5.3.5.1 
signal description, 4.2.6 

boundary scan 

register cell ordering, 6.5 

BPR (bypass register) 
test, 6.2 

br (Branch Direct Unconditionally) 
instruction definition, 10.1 
instruction timing, 10.3 

BR (break read) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

BRDY# (burst ready) 

bear (bus error address register), 2.2.10 

BERR (bus error), 4.2.5 

epsr IL (interlock), 2.2.4 

locked access, 3.2.4.3 

signal description, 4.2.7 

write-once policy, 3.2.4.2 

BREQ (bus request) 
signal description, 4.2.8 

bri (Branch Indirect Unconditionally) 
instruction definition, 10.1 
instruction timing, 10.3 

BS (bus or parity error trap in supervisory mode) 
epsr format description, 2.2.4 

BSR (boundary scan register) 
test, 6.2 

bte (Branch If Equal) 
instruction definition, 10.1 
instruction timing, 10.3 

btne (Branch If Not Equal) 
instruction timing, 10.3 

buffer 

models, 9.4 

size, selection with PEN#, 4.2.35, 5.5, 9.4.3 

burst cycles 
bus cycle, 5.1.2 

bus arbitration 

bus operation, 5.2 


bus and cache control unit 
function of, 1.0 

bus cycles 

back-off and restart, 5.2.2 
bus operation, 5.1 
type output pins, 4.1 

bus errors 

bear (bus error address register), 2.2.10 
trap, 2.8.7 

bus operation 

i860 XP microprocessor, 5.0 

BW (break write) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

BYPASS# (bypass) 
signal description, 4.2.9 
TAP encoding, 6.3 

C 

CACHE# (cacheability) 

BE7#-BE0# (byte enables), 4.2.4 
signal description, 4.2.10 

cache 

address translation, 3.1 
consistency protocol, 3.2.4 
external secondary, 1.0 
inquiry cycles (snooping), 5.3 
internal instruction and data, 3.2 
invalidating entries, 3.3 
on-chip, 3.0 
replacement algorithm, 3.2.3 

cacheability 

address translation caches, 3.1 
consistency, 3.3.4 

calli (Indirect Subroutine Call) 

instruction definition, 10.1 
instruction timing, 10.3 

call (Subroutine Call) 

instruction definition, 10.1 
instruction timing, 10.3 

capture-DR 

test state, 6.4.5 

capture-IR 

test state, 6.4.11 

CC (condition code) 

psr format description, 2.2.3 

ccr (concurrency control register) 

DCCU initialization, 2.5.1 
format description, 2.2.12 

CCUBASE 

ccr (concurrency control register), 2.2.12 
DCCU addressing, 2.5.2 
DCCU initialization, 2.5.1 

CD (cache disable) 

bypassing instruction and data cache, 3.3 
page-table entries (PTEs), 2.4.4.5 

CLK (clock) 

signal description, 4.2.11 

CO (CCU on) 

ccr (concurrency control register), 2.2.12 

color intensity shading 
pixel formats, 2.1.4 

compatibility 

pipelined cycles, 5.1.3 
software changes, 10.5.1 

concurrency control unit (CCU) 

ccr (concurrency control register), 2.2.12 
detached CCU, 2.5 
NEWCURR register, 2.2.13 

consistency 

address space, 3.3.1 
cacheability, 3.3.4 
instruction cache, 3.3.2 
internal cache, 3.3 
load pipe, 3.3.5 
page table, 3.3.3 
protocol, 3.2.4 
write-once policy, 3.2.4.2 

control registers 
register set, 2.2 


copy-back policy 

data cache update, 3.2.1.1 

core execution unit 
function of, 1.0 

CS8 (code size 8-bit) 

BE7#-BE0# (byte enables), 4.2.4 
dirbase format description, 2.2.6 

CTRL-format 

instructions, 10.2.2 

CTYP (cycle type) 

signal description, 4.2.12 

current mode 

high vs. normal, 4.2.35, 5.5, 9.3, 9.4.3 

cycles 

back-off, 5.2.2.1 
burst cycles, 5.1.2 
interrupt acknowledge, 5.1.4 
pipelined, 5.1.3 
restart, 5.2.2.2 
special bus, 5.1.5 

D 

D63-D0 (data pins) 

signal description, 4.2.14 

data access 
fault, 2.8.5 

data cache 
bypassing, 3.3 
flushing, 3.3 
function of, 1.0 
operation, 3.2 
organization, 3.2.1 
states, 3.2.4.1 
update policies, 3.2.1.1 

data types 

i860 XP microprocessor, 2.1 

DAT (data access trap) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

db (data breakpoint register) 

debugging i860 XP microprocessor, 2.9 

format description, 2.2.5 

psr BR (break read) and BW (break write), 2.2.3 

Dbit 

dual-instruction mode, 2.6.2 

D/C# (data/code) 

signal description, 4.2.13 

D.C. characteristics 
electrical data, 9.2 

DCCU (detached concurrency control unit) 
addressing, 2.5.2 

ccr (concurrency control register), 2.2.12 
function of, 1.0 
initialization, 2.5.1 
internals, 2.5.3 

DCS (data cache size) 

epsr format description, 2.2.4 

D (dirty) 

page-table entries (PTEs), 2.4.4.6 

debugging 

i860 XP microprocessor, 2.9 

deferred-write policy 

data cache update, 3.2.1.1 

denormal 

special floating-point values, 2.1.3 

Detached 

STAT register description, 2.2.14 

detached CCU 

i860 XP microprocessor, 2.5 

d.fnop 

dual-instruction mode, 2.6.2 

DID (device identification register) 
test, 6.2 

DIR 

virtual address, 2.4.2 


dirbase (directory base register) 
address space consistency, 3.3.1 
cache replacement algorithm, 3.2.3 
DCCU initialization, 2.5.1 
format description, 2.2.6 
instruction cache consistency, 3.3.2 
page directory, 2.4.3 
page table consistency, 3.3.3 
P (present) bit, 2.4.4.2 

disassemblers 

big endian mode, 2.3 

DI (trap on delayed instruction) 
epsr format description, 2.2.4 

DM (dual instruction mode) 
psr format description, 2.2.3 

DO (detached only) 

ccr (concurrency control register), 2.2.12 

double-precision real 
data type, 2.1.3 

double real value 

floating-point registers, 2.1.3 

double-shift instruction 
psr SC (shift count), 2.2.3 

DP7-DP0 (data parity) 
signal description, 4.2.15 

DPC (data-path control) 

dual-operation instructions, 2.6.3 

DPS (DRAM page size) 

dirbase format description, 2.2.6 

DS (delayed switch) 

psr format description, 2.2.3 

DTB (directory table base) 

dirbase format description, 2.2.6 

dual-instruction mode 
parallelism, 2.6.2 

dual-operation instructions 
floating-point, 2.6.3 

E 

EADS# 

AHOLD (address hold), 4.2.3 

EADS# (external address status) 
signal description, 4.2.16 

epsr (extended processor status register) 
data cache, 3.2.1 
DCCU internals, 2.5.3 
format description, 2.2.4 
page-table entries (PTEs), 2.4.4.3 

EWBE# (external write buffer empty) 
epsr SO (strong ordering), 2.2.4 
signal description, 4.2.17 

exit1-DR 

test state, 6.4.7 

exit1-IR 

test state, 6.4.13 

exit2-DR 

test state, 6.4.9 

exit2-IR 

test state, 6.4.15 

EXTEST 

TAP encoding, 6.3 

F 

faddp (Add with Pixel Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

fadd.p (Floating-Point Add) 
instruction definition, 10.1 
instruction timing, 10.3 

faddz (Add with Z Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

famov.r (Floating-Point Adder Move) 
instruction definition, 10.1 
instruction timing, 10.3 


fault 

address translation, 2.4.6 
data access, 2.8.5 
floating-point, 2.8.3 
instruction access, 2.8.4 
result exception fault, 2.8.3.1 
source exception fault, 2.8.3.1 

fiadd.w (Long-Integer Add) 

instruction definition, 10.1 
instruction timing, 10.3 

fir (fault instruction register) 

epsr DI (trap on delayed instruction), 2.2.4 
format description, 2.2.7 

fisub.w (Long-Integer Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

fix.v (Floating-Point to Integer Conversion) 
instruction definition, 10.1 
instruction timing, 10.3 

fld.y (Floating-Point Load) 
instruction definition, 10.1 
instruction timing, 10.3 

FLINE# (flush line) 

BOFF# choice, 5.3.5.1 
signal description, 4.2.18 

floating-point 
adder, 1.0 
control unit, 1.0 
fault, 2.8.3 

instruction encoding, 10.2.3 
multiplier, 1.0 
register file, 2.2.2 

flush (Cache Flush) 

cache replacement algorithm, 3.2.3 
dirbase RB (replacement block), 2.2.6 
flushing data cache, 3.3 
instruction definition, 10.1 
instruction timing, 10.3 
requirements summary, 3.3.6 

fmlow.dd (Floating-Point Multiply Low) 
instruction definition, 10.1 
instruction timing, 10.3 

fmov.r (Floating-Point Reg-Reg Move) 
instruction definition, 10.1 
instruction timing, 10.3 

fmul.p (Floating-Point Multiply) 

instruction definition, 10.1 
instruction timing, 10.3 

fnop (Floating-Point No Operation) 
instruction definition, 10.1 
instruction timing, 10.3 

form (OR with MERGE Register) 
instruction definition, 10.1 
instruction timing, 10.3 

frcp.p (Floating-Point Reciprocal) 

instruction definition, 10.1 
instruction timing, 10.3 

frsqr.p (Floating-Point Reciprocal Square Root) 
instruction definition, 10.1 
instruction timing, 10.3 

fsr (floating-point status register) 
format description, 2.2.8 
pipelining status information, 2.6.1.2 

fst.y (Floating-Point Store) 
instruction definition, 10.1 
instruction timing, 10.3 

fsub.p (Floating-Point Subtract) 

instruction definition, 10.1 
instruction timing, 10.3 

FTE (floating-point trap enable) 
fsr format description, 2.2.8 


FT (floating-point trap) 

psr format description, 2.2.3 

ftrunc.v (Floating-Point to Integer Conversion) 
instruction definition, 10.1 
instruction timing, 10.3 

fxfr (Transfer F-P to Integer Register) 
instruction definition, 10.1 
instruction timing, 10.3 

fzchkl (32-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

fzchks (16-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

FZ (flush zero) 

fsr format description, 2.2.8 

G 

graphics unit 
function of, 1.0 

H 

hardware interface 

i860 XP microprocessor, 4.0 

HIT# (cache inquiry hit) 
signal description, 4.2.19 

HITM# (hit modified line) 

internal cache consistency, 3.3 
signal description, 4.2.20 

HLDA (bus hold acknowledge) 
signal description, 4.2.21 

HOLD (bus hold) 
bus arbitration, 5.2 
signal description, 4.2.22 

i860 XP microprocessor 
bus operation, 5.0 
functional description, 1.0 
hardware interface, 4.0 
instruction set, 8.0 
mechanical data, 7.0 
on-chip caches, 3.0 
programming interface, 2.0 
testability, 6.0 

IAT (instruction access trap) 
psr format description, 2.2.3 

IDCODE 

TAP encoding, 6.3 

IEEE Standard 

for Binary Floating-Point Arithmetic, 1.0 
P1149.1/D6 testability, 6.0 

IL (interlock) 

epsr format description, 2.2.4 

IM (interrupt mode) 

psr format description, 2.2.3 

indefinite 

special floating-point values, 2.1.3 

inexact result 

result exception fault, 2.8.3.2 

initialization 
at RESET, 5.5 

infinity 

special floating-point values, 2.1.3 

IN (interrupt) 

psr format description, 2.2.3 

InLoop 

STAT register description, 2.2.14 

inquiry cycles 

data cache states, 3.2.4.1 
for line being cached, 5.3.2.1 
for line being replaced, 5.3.2.2 
snooping, 5.3 
write-back, 5.3.1 


instruction 

access fault, 2.8.4 
characteristics, 10.4 
CTRL-format, 10.2.2 
definitions, 10.1 
dual-operation, 2.6.3 
encoding floating-point, 10.2.3 
fault, 2.8.2 

format and encoding, 10.2 
REG-format, 10.2.1 
timing, 10.3 

instruction cache 
bypassing, 3.3 
consistency, 3.3.2 
function of, 1.0 
operation, 3.2 
organization, 3.2.2 

instruction set 

abbreviations, 10.0 
extensions of i860 XR, 2.6 
i860 XP microprocessor, 8.0 

INT/CS8 (interrupt/code-size 8-bits) 
signal description, 4.2.24 

integer 

data type, 2.1.1 
register file, 2.2.1 

internal cache 
consistency, 3.3 

interrupt 

acknowledge cycles, 5.1.4 
i860 XP microprocessor, 2.8 
trap, 2.8.8 

INT (interrupt) 

epsr format description, 2.2.4 

intovr (Software Trap on Integer Overflow) 
instruction definition, 10.1 
instruction timing, 10.3 

INT pin 

epsr INT (interrupt), 2.2.4 
psr IM (interrupt mode), 2.2.3 

invalidation requirements 
summary, 3.3.6 

INV (invalidate) 

signal description, 4.2.23 

IR (instruction register) 
test, 6.3 

IRP (integer graphics) 

fsr format description, 2.2.8 

ITI (cache and TLB invalidate) 
dirbase format description, 2.2.6 

IT (instruction trap) 

psr format description, 2.2.3 

ixfr (Transfer Integer to F-P Register) 
instruction definition, 10.1 
instruction timing, 10.3 


KB0, KB1 (cache block) 
signal description, 4.2.25 

KEN# (cache enable) 

BE7#-BE0# (byte enables), 4.2.4 
bypassing instruction and data cache, 3.3 
DCCU addressing, 2.5.2 
internal instruction and data caches, 3.2 
locked access, 3.2.4.3 
signal description, 4.2.26 

KI 

special purpose register description, 2.2.9 

KNF (kill next floating-point instruction) 
psr format description, 2.2.3 

KR 

special purpose register description, 2.2.9 

L 

LB (late back-off mode) 

dirbase format description, 2.2.6 

LCC (loop condition code) 
psr CC (condition code), 2.2.3 


ld.c (Load from Control Register) 
fir (fault instruction register), 2.2.7 
instruction definition, 10.1 
instruction timing, 10.3 

ldint.x (Load Interrupt Vector) 
big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

ldio.x (Load I/O) 
big endian mode, 2.3 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

ld.l 

flushing data cache, 3.3 

ld.x (Load Integer) 

DCCU internals, 2.5.3 
instruction definition, 10.1 
instruction timing, 10.3 

LEN (data length) 

signal description, 4.2.27 

LFBSR (linear feedback shift register) 
cache replacement algorithm, 3.2.3 

little endian mode 
addressing, 2.3 

load pipe 

consistency, 3.3.5 

LOCK# (address lock) 

A (accessed) bit, 2.4.4.6 
cycle attribute, 5.4 
dirbase BL (bus lock), 2.2.6 
signal description, 4.2.28 

lock (Begin Interlocked Sequence) 
dirbase BL (bus lock), 2.2.6 
instruction definition, 10.1 
instruction timing, 10.3 
locked access, 3.2.4.3 

locked access 

cache consistency, 3.2.4.3 

lock instruction 

epsr IL (interlock), 2.2.4 

lock protocol 

instruction fault, 2.8.2.1 

LRP0 (load pipe result precision) 
fsr format description, 2.2.8 

LRP1 (load pipe result precision) 
fsr format description, 2.2.8 

M 

MA 

fsr U-bit (update bit), 2.2.8 

mechanical data 

i860 XP microprocessor, 7.0 

MERGE 

special purpose register description, 2.2.9 

MESI 

cache consistency protocol, 3.2.4 
write cycle reordering, 5.3.3 

MI 

fsr U-bit (update bit), 2.2.8 

M/IO# (memory-I/O) 
signal description, 4.2.29 

MO 

fsr U-bit (update bit), 2.2.8 

mov (Constant-to-Register Move) 
instruction definition, 10.1 

mov (Register-Register Move) 
instruction definition, 10.1 
instruction timing, 10.3 

MU 

fsr U-bit (update bit), 2.2.8 


N 

NA# (next address request) 
locked access, 3.2.4.3 
signal description, 4.2.30 
write-once policy, 3.2.4.2 

NaN (Not a Number) 

special floating-point values, 2.1.3 

NENE# (next near) 

dirbase DPS (DRAM page size), 2.2.6 
signal description, 4.2.31 

Nested 

STAT register description, 2.2.14 

NEWCURR register 
DCCU internals, 2.5.3 
format description, 2.2.13 

nonpipelined cycle 
bus cycle, 5.1.3 

nop (Core-Unit No Operation) 
instruction definition, 10.1 
instruction timing, 10.3 


offset 

addressing modes, 2.7 
virtual address, 2.4.2 

OF (overflow flag) 

epsr format description, 2.2.4 

on-chip caches 

i860 XP microprocessor, 3.0 

ordinal 

data type, 2.1.2 

orh (Logical OR High) 
instruction definition, 10.1 
instruction timing, 10.3 

or (Logical OR) 

instruction definition, 10.1 
instruction timing, 10.3 

output pins 

pins overview, 4.1 

overflow 

result exception fault, 2.8.3.2 


package 

thermal specifications, 8.0 

PAGE 

virtual address, 2.4.2 

page directory 

little endian mode, 2.3 
page tables, 2.4.3 

paged virtual-address space 
addressing, 2.3 

page frame 

address, 2.4.4.1 
physical main memory, 2.4.1 

page table 

combining protection, 2.4.4.8 

PBM (page-table bit mode) 
epsr format description, 2.2.4 

CD (cache disable), 2.4.4.5 
signal description, 4.2.32 

PCHK# (parity check) 
signal description, 4.2.33 

software compatibility, 10.5.2 

pfaddp (Pipelined Add with Pixel Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

pfadd.p (Pipelined Floating-Point Add) 
instruction definition, 10.1 
instruction timing, 10.3 

pfaddz (Pipelined Add with Z Merge) 

instruction definition, 10.1 
instruction timing, 10.3 

pfamov.r (Pipelined Floating-Point Adder Move) 
instruction definition, 10.1 
instruction timing, 10.3 

pfam.p (Pipelined Floating-Point Add and Multiply) 
dual-operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 
special purpose registers, 2.2.9 

pfeq.p (Pipelined Floating-Point Equal Compare) 
instruction definition, 10.1 
instruction timing, 10.3 

pfgt.p (Pipelined Floating-Point Greater-Than 
Compare) 

instruction definition, 10.1 
instruction timing, 10.3 

pfiadd.w (Pipelined Long-Integer Add) 
instruction definition, 10.1 
instruction timing, 10.3 

pfisub.w (Pipelined Long-Integer Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

pfix.v (Pipelined Floating-Point to Integer 
Conversion) 
instruction definition, 10.1 

pfld (Pipelined Floating-Point Load) 
epsr PT (trap on pipeline use), 2.2.4 
load pipe consistency, 3.3.5 
pipeline loads, 2.6.1.5 

pfld.q 

extensions of i860 XR, 2.6 

pfld.y (Pipelined Floating-Point Load) 

instruction definition, 10.1 
instruction timing, 10.3 

pfle.p (Pipelined F-P Less-Than or Equal Compare) 
instruction definition, 10.1 
instruction timing, 10.3 

dual operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 

pfmov.r (Pipelined Floating-Point Reg-Reg Move) 
instruction definition, 10.1 
instruction timing, 10.3 

pfmsm.p (Pipelined Floating-Point Subtract 
and Multiply) 

dual operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 
special purpose registers, 2.2.9 


pfmul3.dd (Three-Stage Pipelined Multiply) 
instruction definition, 10.1 
instruction timing, 10.3 

pfmul.p (Pipelined Floating-Point Multiply) 
instruction definition, 10.1 
instruction timing, 10.3 

pform (Pipelined OR to MERGE Register) 
instruction definition, 10.1 
instruction timing, 10.3 

pfsm.p (Pipelined Floating-Point Subtract 
and Multiply) 

dual-operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 
special purpose registers, 2.2.9 

pfsub.p (Pipelined Floating-Point Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

pftrunc.v (Pipelined Floating-Point to 
Integer Conversion) 

instruction definition, 10.1 
instruction timing, 10.3 

pfzchkl (Pipelined 32-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

pfzchks (Pipelined 16-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

physical main memory 
page frame, 2.4.1 

physical tags 

internal instruction and data caches, 3.2 

PI bit 

using, 2.8.2.2 

PIM (previous interrupt mode) 
psr format description, 2.2.3 

pins overview 

hardware interface, 4.1 

pipeline 

cycles, 5.1.3 
loads, 2.6.1.5 
operations, 2.6.1 
precision in, 2.6.1.3 
scalar transition, 2.6.1.4 
status information, 2.6.1.2 

PI (pipeline instruction) 

epsr format description, 2.2.4 

pixel 

data type, 2.1.4 

PM (pixel mask) 

psr format description, 2.2.3 

P (present) 

page-table entries (PTEs), 2.4.4.2 

privileged registers 

format description, 2.2.11 

processor 

revisions, 2.2.4 
type, 2.2.4 

programming interface 

i860 XP microprocessor, 2.0 

PS (pixel size) 

psr format description, 2.2.3 

psr (processor status register) 

debugging i860 XP microprocessor, 2.9 
format description, 2.2.3 
page-table entries (PTEs), 2.4.4.3 

pst.d (Pixel Store) 

instruction definition, 10.1 
instruction timing, 10.3 

psr PS (pixel size) and PM (pixel mask), 2.2.3 

PT (trap on pipeline use) 

epsr format description, 2.2.4 
using, 2.8.2.2 

PU (previous user mode) 
psr format description, 2.2.3 


PWT (page write-through) 
signal description, 4.2.36 
WT (write-through), 2.4.4.4 

R 

ratings 

absolute maximum, 9.1 

RB (replacement block) 

dirbase format description, 2.2.6 

RC (replacement control) 

dirbase format description, 2.2.6 

REG-format 

instructions, 10.2.1 

register cell ordering 
boundary scan, 6.5 

replacement algorithm 
cache, 3.2.3 

RESET (system reset) 

AHOLD (address hold), 4.2.3 

bear (bus error address register), 2.2.10 

cache replacement algorithm, 3.2.3 

epsr BEF (bus error flag), 2.2.4 

epsr SO (strong ordering), 2.2.4 

initialization, 5.5 

signal description, 4.2.37 

trap, 2.8.9 

restart 

bus cycle, 5.2.2 

result exception fault 
floating-point, 2.8.3.1 

right-shift instruction 

psr SC (shift count), 2.2.3 

RM (rounding mode) 

fsr format description, 2.2.8 

RR (result register) 

fsr format description, 2.2.8 

run-test/idle 
test state, 6.4.2 

S 

SAMPLE 

TAP encoding, 6.3 

scalar 

mode, 2.6.1.1 
operations, 2.6.1 
pipelined transition, 2.6.1.4 

SC (shift count) 

psr format description, 2.2.3 


shr (Shift Right) 

instruction definition, 10.1 
instruction timing, 10.3 

signal description 

hardware interface, 4.2 

single-precision real 
data type, 2.1.3 

single-transfer cycle 
bus cycle, 5.1.1 


scyc.x (Special Cycles) 
big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 


select-DR-scan 
test state, 6.4.3 

select-IR-scan 
test state, 6.4.4 

serializing 

locked access, 3.2.4.3 

SE (source exception) 

fsr format description, 2.2.8 

shift-DR 

test state, 6.4.6 

shift-IR 

test state, 6.4.12 

shl (Shift Left) 

instruction definition, 10.1 
instruction timing, 10.3 

shra (Shift Right Arithmetic) 
instruction definition, 10.1 
instruction timing, 10.3 

shrd (Shift Right Double) 
instruction definition, 10.1 
instruction timing, 10.3 


SI (sticky inexact) 

fsr format description, 2.2.8 

snooping 

inquiry cycles, 5.3 

internal instruction and data caches, 3.2 
responsibility limits, 5.3.2 

software compatibility 
required changes, 10.5.1 

SO (strong ordering) 

epsr format description, 2.2.4 

source exception fault 
floating-point, 2.8.3.1 

spare 

signal description, 4.2.38 

special bus 
cycles, 5.1.5 

special-purpose registers 
register set, 2.2 

special values 

floating-point numbers, 2.1.3 

STAT register 

DCCU internals, 2.5.3 
format description, 2.2.14 

st.c (Store to Control Register) 
address translation, 2.4 
dirbase BL (bus lock), 2.2.6 
dirbase CS8 (code size 8-bit), 2.2.6 
fsr U-bit (update bit), 2.2.8 
instruction definition, 10.1 
instruction timing, 10.3 
privileged registers, 2.2.11 

stepping number 

epsr format description, 2.2.4 

stio.x (Store I/O) 

big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

strong ordering mode 
inquiry cycle, 5.3.4 

st.x (Store Integer) 

DCCU internals, 2.5.3 
instruction definition, 10.1 
instruction timing, 10.3 

subs (Subtract Signed) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

subu (Subtract Unsigned) 
epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

supervisor/user mode 
addressing, 2.3 

ccr (concurrency control register), 2.2.12 
psr U (user mode), 2.2.3 


T 

special purpose register description, 2.2.9 

tags 

internal instruction and data caches, 3.2 


TAI (Trap On Autoincrement) 
epsr format description, 2.2.4 
fsr U-bit (update bit), 2.2.8 

TAP (test access port) 
controller, 6.4 
controller initialization, 6.6 
testability, 6.0 

TCK (test clock) 

signal description, 4.2.39 

TDI (test data input) 

signal description, 4.2.40 

TDO (test data output) 
signal description, 4.2.41 

test 

architecture, 6.1 
data registers, 6.2 

testability 

i860 XP microprocessor, 6.0 

test-logic-reset 
test state, 6.4.1 

test state 

capture-DR, 6.4.5 
capture-IR, 6.4.11 
exit1-DR, 6.4.7 
exit1-IR, 6.4.13 
exit2-DR, 6.4.9 
exit2-IR, 6.4.15 
pause-DR, 6.4.8 
pause-IR, 6.4.14 
run-test/idle, 6.4.2 
select-DR-scan, 6.4.3 
select-IR-scan, 6.4.4 
shift-DR, 6.4.6 
shift-IR, 6.4.12 
test-logic-reset, 6.4.1 
update-DR, 6.4.10 
update-IR, 6.4.16 

thermal specifications 
package, 8.0 

TI (trap inexact) 
fsr format description, 2.2.8 

update-IR 
test state, 6.4.16 


TLB 

address translation caches, 3.1 
DCCU addressing, 2.5.2 
internal cache consistency, 3.3 

TMS (test mode select) 
signal description, 4.2.42 

trap handler 
invocation, 2.8.1 
page tables, 2.4.4.7 

trap (Software Trap) 
bus error, 2.8.7 
i860 XP microprocessor, 2.8 
instruction cache consistency, 3.3.2 
instruction definition, 10.1 
instruction timing, 10.3 
interrupt, 2.8.8 
parity error, 2.8.6 
RESET, 2.8.9 

tri-state 

output pins, 4.1 

TRST# (test reset) 

signal description, 4.2.43 

U 

U-bit (update bit) 

fsr format description, 2.2.8 

underflow 

result exception fault, 2.8.3.2 

unlock (End Interlocked Sequence) 
dirbase BL (bus lock), 2.2.6 
epsr IL (interlock), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

update-DR 

test state, 6.4.10 


user/supervisor mode 

ccr (concurrency control register), 2.2.12 
psr U (user mode), 2.2.3 

U (user) 

page-table entries (PTEs), 2.4.4.3 
psr format description, 2.2.3 

V 

VccCLK (clock power) 
signal description, 4.2.45 

Vcc (system power) 
signal description, 4.2.44 

virtual address 

address translation caches, 3.1 
CCUBASE, 2.2.12 
format description, 2.4.2 
i860 XP microprocessor, 2.4 

virtual tag 

instruction cache, 3.2.2 

internal instruction and data caches, 3.2 

Vss (ground) 

signal description, 4.2.44 

W 

wait state 

single-transfer cycle, 5.1.1 

WB/WT# (write-back/write-through) 
signal description, 4.2.46 
write-once policy, 3.2.4.2 

WP (write protect) 

epsr format description, 2.2.4 
page-table entries (PTEs), 2.4.4.3 

W/R# (write/read) 

signal description, 4.2.47 
write-once policy, 3.2.4.2 

write-back 

data cache update policy, 3.2.1.1 
with FLINE#, 5.3.5.2 
inquiry cycles, 5.3.1 
scheduling inquiry cycles, 5.3.5 

write cycle 

reordering due to buffering, 5.3.3 

write-once 

cache consistency, 3.2.4.2 
data cache update policy, 3.2.1.1 

write-through 

data cache update policy, 3.2.1.1 

WT (write-through) 

page-table entries (PTEs), 2.4.4.4 
write-through policy, 3.2.1.1 

W (writable) 

page-table entries (PTEs), 2.4.4.3 


X 

xorh (Logical Exclusive OR High) 
instruction definition, 10.1 
instruction timing, 10.3 

xor (Logical Exclusive OR) 
instruction definition, 10.1 
instruction timing, 10.3 

Z 

Z-buffer 

special purpose registers, 2.2.9 

■ Parallel Architecture that Supports Up 
to Three Operations per Clock 

— One Integer or Control Instruction 
per Clock 

— Up to Two Floating-Point Results per 
Clock 

■ High Performance Design 

— 25/33.3/40 MHz Clock Rates 

— 80 Peak Single Precision MFLOPs 

— 60 Peak Double Precision MFLOPs 

— 64-Bit External Data Bus 

— 64-Bit Internal Instruction Cache Bus 

— 128-Bit Internal Data Cache Bus 

■ High Level of Integration on One Chip 

— 32-Bit Integer and Control Unit 

— 32/64-Bit Pipelined Floating-Point 
Adder and Multiplier Units

— 64-Bit 3-D Graphics Unit 

— Paging Unit with Translation 
Lookaside Buffer 

— 4 Kbyte Instruction Cache 

— 8 Kbyte Data Cache 


■ Compatible with Industry Standards

— ANSI/IEEE Standard 754-1985 for
Binary Floating-Point Arithmetic

— Intel386™/486™ Microprocessor
Data Formats and Page Table Entries

— JEDEC 168-pin Ceramic Pin Grid
Array Package (see Packaging
Outlines and Dimensions, order
#231369)

■ Easy to Use 

— On-Chip Debug Register 
— Assembler, Linker, Simulator, 
Debugger, C and FORTRAN 
Compilers, FORTRAN Vectorizer, 
Scalar and Vector Math Libraries for 
both OS/2* and UNIX* Environments 


The Intel i860™ XR Microprocessor (order codes A80860XR-25, A80860XR-33 and A80860XR-40) delivers
supercomputing performance in a single VLSI component. The 64-bit design of the i860 XR microprocessor 
balances integer, floating point, and graphics performance for applications such as engineering workstations, 
scientific computing, 3-D graphics workstations, and multiuser systems. Its parallel architecture achieves high 
throughput with RISC design techniques, pipelined processing units, wide data paths, large on-chip caches, 
million-transistor design, and fast one-micron CHMOS IV silicon technology. 


Figure 0.1. Block Diagram (external connections shown: A31-A3, D63-D0, CONTROL)


Intel, intel, Intel386™, Intel486™, i860 XR, Multibus II and Parallel System Bus are trademarks of Intel Corporation.

*UNIX is a registered trademark of UNIX System Laboratories, Inc. OS/2 is a trademark of International Business Machines 
Corporation. 
1.0 FUNCTIONAL DESCRIPTION

As shown by the block diagram on the front page,
the i860 XR microprocessor consists of 9 units:

1. Core Execution Unit

2. Floating-Point Control Unit 

3. Floating-Point Adder Unit 

4. Floating-Point Multiplier Unit 

5. Graphics Unit 

6. Paging Unit 

7. Instruction Cache 

8. Data Cache 

9. Bus and Cache Control Unit 

The core execution unit controls overall operation of
the i860 XR microprocessor. The core unit executes
load, store, integer, bit, and control-transfer
operations, and fetches instructions for the
floating-point unit as well. A set of 32 x 32-bit
general-purpose registers is provided for the
manipulation of integer data. Load and store
instructions move 8-, 16-, and 32-bit data to and from
these registers. Its full set of integer, logical, and
control-transfer instructions gives the core unit the
ability to execute complete systems software and
applications programs. A trap mechanism provides rapid
response to exceptions and external interrupts.
Debugging is supported by the ability to trap on data
or instruction reference.

The floating-point hardware is connected to a separate
set of floating-point registers, which can be accessed
as 16 x 64-bit registers or 32 x 32-bit registers.
Special load and store instructions can also access
these same registers as 8 x 128-bit registers. All
floating-point instructions use these registers as
their source and destination operands.

The floating-point control unit controls both the float- 
ing-point adder and the floating-point multiplier, issu- 
ing instructions, handling all source and result 
exceptions, and updating status bits in the floating- 
point status register. The adder and multiplier can 
operate in parallel, producing up to two results per 
clock. The floating-point data types, floating-point in- 
structions, and exception handling all support the 
IEEE Standard for Binary Floating-Point Arithmetic 
(ANSI/IEEE Std 754-1985). 

The floating-point adder performs addition,
subtraction, comparison, and conversions on 64- and
32-bit floating-point values. An adder instruction
executes in three clocks; however, in pipelined mode, a
new result is generated every clock.

The floating-point multiplier performs floating-point
and integer multiply and floating-point reciprocal
operations on 64- and 32-bit floating-point values. A
multiplier instruction executes in three to four
clocks; however, in pipelined mode, a new result can be
generated every clock for single precision and every
other clock for double precision.
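
The pipelined result rates above account for the peak MFLOPs figures quoted in the feature list. The following Python sketch shows the arithmetic; the function name is illustrative only, and it assumes the peak is reached when the adder and multiplier pipelines are both kept full.

```python
def peak_mflops(clock_mhz: float, double_precision: bool) -> float:
    """Peak floating-point results per microsecond (MFLOPs).

    Assumes both pipelines are kept full: the adder delivers one result
    per clock; the multiplier delivers one result per clock in single
    precision and one every other clock in double precision.
    """
    adder_results_per_clock = 1.0
    multiplier_results_per_clock = 0.5 if double_precision else 1.0
    return clock_mhz * (adder_results_per_clock + multiplier_results_per_clock)


single_peak = peak_mflops(40, double_precision=False)
double_peak = peak_mflops(40, double_precision=True)
```

At 40 MHz this gives the 80 (single precision) and 60 (double precision) peak values listed among the features.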

The graphics unit has special integer logic that
supports three-dimensional drawing in a graphics frame
buffer, with color intensity shading and hidden surface
elimination via the Z-buffer algorithm. The graphics
unit recognizes the pixel as an 8-, 16-, or 32-bit data
type. It can compute individual red, blue, and green
color intensity values within a pixel; but it does so
with parallel operations that take advantage of the
64-bit internal word size and 64-bit external bus. The
graphics features of the i860 XR microprocessor assume
that the surface of a solid object is drawn with
polygon patches whose shapes approximate the original
object. The color intensities of the vertices of the
polygon and their distances from the viewer are known,
but the distances and intensities of the other points
must be calculated by interpolation. The graphics
instructions of the i860 XR microprocessor directly aid
such interpolation.

The paging unit implements protected, paged, virtual 
memory via a 64-entry, four-way set-associative 
memory called the TLB (Translation Lookaside Buff- 
er). The paging unit uses the TLB to perform the 
translation of logical address to physical address, 
and to check for access violations. The access pro- 
tection scheme employs two levels of privilege: user 
and supervisor. 

The instruction cache is a two-way set-associative 
memory of four Kbytes, with 32-byte blocks. It trans- 
fers up to 64 bits per clock (320 Mbyte/sec at 
40 MHz). 

The data cache is a two-way set-associative memo- 
ry of eight Kbytes, with 32-byte blocks. It transfers 
up to 128 bits per clock (640 Mbyte/sec at 40 MHz). 
The i860 XR microprocessor normally uses write- 
back caching, i.e. memory writes update the cache 
(if applicable) without necessarily updating memory 
immediately; however, caching can be inhibited by 
software where necessary. 
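
The transfer rates quoted for the two caches follow from the path width and the clock rate. A minimal Python sketch of that arithmetic follows; the function name is illustrative, and 1 Mbyte is taken as 10^6 bytes, which is an assumption that matches the quoted figures.

```python
def peak_mbytes_per_sec(bits_per_clock: int, clock_mhz: float) -> float:
    """Peak transfer rate in Mbyte/sec, taking 1 Mbyte as 10**6 bytes."""
    bytes_per_clock = bits_per_clock / 8
    return bytes_per_clock * clock_mhz


# Instruction cache: up to 64 bits per clock at 40 MHz
icache_rate = peak_mbytes_per_sec(64, 40)

# Data cache: up to 128 bits per clock at 40 MHz
dcache_rate = peak_mbytes_per_sec(128, 40)
```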

The bus and cache control unit performs data and
instruction accesses for the core unit. It receives
cycle requests and specifications from the core unit,
performs the data-cache or instruction-cache miss
processing, controls TLB translation, and provides the
interface to the external bus. Its pipelined structure
supports up to three outstanding bus cycles.


2.0 PROGRAMMING INTERFACE 

The programmer-visible aspects of the architecture 
of the i860 XR microprocessor include data types, 
registers, instructions, and traps. 
2.1 Data Types

The i860 XR microprocessor provides operations for
integer and floating-point data. Integer operations
are performed on 32-bit operands with some support
also for 64-bit operands. Load and store instructions
can reference 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit
operands. Floating-point operations are performed
on IEEE-standard 32- and 64-bit formats.
Graphics-oriented instructions operate on arrays of 8-,
16-, or 32-bit pixels.

2.1.1 INTEGER 

An integer is a 32-bit signed value in standard two’s
complement form. A 32-bit integer can represent a
value in the range -2,147,483,648 ($-2^{31}$) to
2,147,483,647 ($+2^{31} - 1$). Arithmetic operations on
8- and 16-bit integers can be performed by
sign-extending the 8- or 16-bit values to 32 bits, then
using the 32-bit operations.

There are also add and subtract instructions that op- 
erate on 64-bit long integers. 

Load and store instructions may also reference (in
addition to the 32- and 64-bit formats previously
mentioned) 8- and 16-bit items in memory. When an
8- or 16-bit item is loaded into a register, it is
converted to an integer by sign-extending the value to
32 bits. When an 8- or 16-bit item is stored from a
register, the corresponding number of low-order bits
of the register are used.
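
The load and store conversions described above can be modeled in a few lines. The Python sketch below treats a register as a 32-bit unsigned image; all function names are hypothetical helpers for illustration and do not correspond to i860 instructions.

```python
MASK32 = 0xFFFFFFFF


def load_sign_extended(value: int, bits: int) -> int:
    """Sign-extend an 8- or 16-bit memory item to a 32-bit register image."""
    if bits not in (8, 16):
        raise ValueError("only 8- and 16-bit items are sign-extended")
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    # For a negative item, subtracting 2**bits gives its two's complement value.
    signed = value - (sign << 1) if value & sign else value
    return signed & MASK32


def store_low_bits(register: int, bits: int) -> int:
    """Return the low-order bits of a register that an 8- or 16-bit store uses."""
    return register & ((1 << bits) - 1)


def as_signed32(register: int) -> int:
    """Interpret a 32-bit register image as a two's complement integer."""
    register &= MASK32
    return register - (1 << 32) if register & 0x80000000 else register
```

For example, loading the 8-bit item 0x80 yields the register image 0xFFFFFF80, which reads as -128; storing that register as an 8-bit item writes 0x80 again.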


2.1.2 ORDINAL 

Arithmetic operations are available for 32-bit
ordinals. An ordinal is an unsigned integer. An ordinal
can represent values in the range 0 to
4,294,967,295 ($2^{32} - 1$).

Also, there are add and subtract instructions that
operate on 64-bit ordinals.


2.1.3 SINGLE- AND DOUBLE-PRECISION REAL 

Figure 2.1 shows the real number formats. A
single-precision real (also called “single real”) data
type is a 32-bit binary floating-point number. Bit 31
is the sign bit; bits 30..23 are the exponent; and bits
22..0 are the fraction. In accordance with ANSI/IEEE
standard 754, the value of a single-precision real is
defined as follows:

1. If e = 0 and f ≠ 0, or e = 255, then generate a
floating-point source-exception trap when
encountered in a floating-point operation.

2. If 0 < e < 255, then the value is
$(-1)^s \times 1.f \times 2^{e-127}$.

3. If e = 0 and f = 0, then the value is signed zero. 

A double-precision real (also called “double real”)
data type is a 64-bit binary floating-point number. Bit
63 is the sign bit; bits 62..52 are the exponent; and
bits 51..0 are the fraction. In accordance with
ANSI/IEEE standard 754, the value of a double-precision
real is defined as follows:

1. If e = 0 and f ≠ 0, or e = 2047, then generate a
floating-point source-exception trap when
encountered in a floating-point operation.

2. If 0 < e < 2047, then the value is
$(-1)^s \times 1.f \times 2^{e-1023}$.

3. If e = 0 and f = 0, then the value is signed zero.


Figure 2.1. Real Number Formats

The special values infinity, NaN (“Not a Number”),
indefinite, and denormal generate a trap when
encountered. The trap handler implements
IEEE-standard results.
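
The three rules for a single-precision real can be checked against a host machine's IEEE arithmetic. The Python sketch below is an illustration only: `classify_single` and `single_bits` are hypothetical names, and the string "trap" merely marks the operands for which the text says a source-exception trap is generated.

```python
import struct


def classify_single(bits: int):
    """Split a 32-bit pattern into s, e, f and apply the three rules."""
    s = (bits >> 31) & 0x1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 255 or (e == 0 and f != 0):
        # Infinity, NaN, or denormal: a source-exception trap is generated.
        return ("trap", None)
    if e == 0:
        return ("zero", -0.0 if s else 0.0)
    value = (-1.0) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)
    return ("normal", value)


def single_bits(x: float) -> int:
    """Helper: the IEEE single-precision bit pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]
```

The double-precision case differs only in the field widths (1, 11, and 52 bits) and the exponent bias of 1023.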

A double real value occupies an even/odd pair of
floating-point registers. Bits 31..0 are stored in the
even-numbered floating-point register; bits 63..32
are stored in the next higher odd-numbered
floating-point register.
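
As an illustration of this register-pair layout, the following Python sketch splits a double real into the two 32-bit words that would occupy the even and odd registers, and reassembles it. The function names are hypothetical.

```python
import struct


def double_to_register_pair(x: float):
    """Return (even_reg, odd_reg): bits 31..0 and bits 63..32 of a double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    even_reg = bits & 0xFFFFFFFF          # low-order word, even register
    odd_reg = (bits >> 32) & 0xFFFFFFFF   # high-order word, next odd register
    return even_reg, odd_reg


def register_pair_to_double(even_reg: int, odd_reg: int) -> float:
    """Reassemble the double from its even/odd register words."""
    bits = ((odd_reg & 0xFFFFFFFF) << 32) | (even_reg & 0xFFFFFFFF)
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x
```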


2.1.4 PIXEL 

A pixel may be 8, 16, or 32 bits long depending on 
color and intensity resolution requirements. Regard- 
less of the pixel size, the i860 XR microprocessor 
always operates on 64 bits worth of pixels at a time. 
The pixel data type is used by two kinds of
instructions:

• The selective pixel-store instruction that helps im- 
plement hidden surface elimination. 

• The pixel add instruction that helps implement 
3-D color intensity shading. 

To perform color intensity shading efficiently in a va- 
riety of applications, the i860 XR microprocessor de- 
fines three pixel formats according to Table 2.1. 

Figure 2.2 illustrates one way of assigning meaning 
to the fields of pixels. These assignments are for 
illustration purposes only. The i860 XR microproces- 
sor defines only the field sizes, not the specific use 
of each field. Other ways of using the fields of pixels 
are possible. 


Table 2.1. Pixel Formats

Pixel Size   Bits of     Bits of     Bits of     Bits of Other
(in bits)    Color 1     Color 2     Color 3     Attribute
             Intensity   Intensity   Intensity   (Texture)
----------   ---------   ---------   ---------   -------------
8            N (≤ 8) bits of intensity*          8 - N
16           6           6           4           -
32           8           8           8           8

The intensity attribute fields may be assigned to colors in
any order convenient to the application.

*With 8-bit pixels, up to 8 bits can be used for intensity; the
remaining bits can be used for any other attribute, such as
color. The intensity bits must be the low-order bits of the
pixel.

2.2 Register Set 

As Figure 2.3 shows, the i860 XR microprocessor
has the following registers:

• An integer register file

• A floating-point register file

• Six control registers (psr, epsr, db, dirbase, fir,
and fsr)

• Four special-purpose registers (KR, KI, T, and
MERGE)

The control registers are accessible only by load
and store control-register instructions; the integer
and floating-point registers are accessed by
arithmetic operations and load and store instructions.
The special-purpose registers KR, KI, T, and MERGE are
used by a few specific instructions.


I — Intensity, R — Red intensity, G — Green intensity, B — Blue intensity, C — Color, T — Texture

These assignments of specific meanings to the fields of pixels are for illustration purposes only. Only the field sizes are
defined, not the specific use of each field.


Figure 2.2. Pixel Format Example




2.2.1 INTEGER REGISTER FILE 

There are 32 integer registers, each 32 bits wide,
referred to as r0 through r31, which are used for
address computation and scalar integer computations.
Register r0 always returns zero when read,
independently of what is stored in it.

2.2.2 FLOATING-POINT REGISTER FILE 

There are 32 floating-point registers, each 32 bits
wide, referred to as f0 through f31, which are used
for floating-point computations. Registers f0 and f1
always return zero when read, independently of
what is stored in them. The floating-point registers
are also used by a set of graphics operations,
primarily for 3-D graphics computations.

When accessing 64-bit floating-point or integer
values, the i860 XR microprocessor uses an even/odd
pair of registers. When accessing 128-bit values, it
uses an aligned set of four registers (f0, f4, f8, ...,
f28). The instruction must designate the lowest
register number of the set of registers containing 64-
or 128-bit values. Misaligned register numbers produce
undefined results. The register with the lowest number
contains the least significant part of the value.
For 128-bit values, the register pair with the lower
numbers contains the least significant 64 bits while
the register pair with the higher numbers contains the
most significant 64 bits.
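
The alignment rule for 64- and 128-bit values can be expressed as a simple check. The Python sketch below is a hypothetical assembler-style helper, not part of any Intel tool; it rejects misaligned register numbers because the hardware result for them is undefined.

```python
def is_aligned(lowest: int, bits: int) -> bool:
    """True if register number `lowest` may designate a value of `bits` bits."""
    if bits not in (32, 64, 128):
        raise ValueError("operand size must be 32, 64, or 128 bits")
    return 0 <= lowest <= 31 and lowest % (bits // 32) == 0


def register_set(lowest: int, bits: int):
    """Registers holding the value, least significant part first."""
    if not is_aligned(lowest, bits):
        raise ValueError(f"f{lowest} is misaligned for a {bits}-bit value")
    return [lowest + i for i in range(bits // 32)]
```

For a 128-bit value designated by f4, this lists f4 through f7, with f4 and f5 holding the least significant 64 bits.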


The 128-bit load and store instructions, along with
the 128-bit data path between the floating-point
registers and the data cache, help to sustain the
extraordinarily high rate of computation.


2.2.3 PROCESSOR STATUS REGISTER 


The processor status register (psr) contains miscel- 
laneous state information for the current process. 
Figure 2.4 shows the format of the psr. 

• BR (Break Read) and BW (Break Write) enable a 
data access trap when the operand address 
matches the address in the db register and a 
read or write (respectively) occurs. 


• Various instructions set CC (Condition Code)
according to tests they perform. The
branch-on-condition-code instructions test its value.
The bla instruction sets and tests LCC (Loop Condition
Code).

• IM (Interrupt Mode) enables external interrupts if
set; disables interrupts if clear.

• U (User Mode) is set when the i860 XR
microprocessor is executing in user mode; it is clear
when the i860 XR microprocessor is executing in
supervisor mode. In user mode, writes to some
control registers are inhibited. This bit also
controls the memory protection mechanism. See
section 2.4.4.3 for a description of memory
protection in user and supervisor modes.
Figure 2.3. Registers and Data Paths
Figure 2.4. Processor Status Register

Fields, from the low-order bit upward: BR (Break Read),
BW (Break Write), CC (Condition Code), LCC (Loop
Condition Code), IM (Interrupt Mode), PIM (Previous
Interrupt Mode), U (User Mode), PU (Previous User Mode),
IT (Instruction Trap), IN (Interrupt), IAT (Instruction
Access Trap), DAT (Data Access Trap), FT (Floating-Point
Trap), DS (Delayed Switch), DIM (Dual Instruction Mode),
KNF (Kill Next Floating-Point Instruction), (reserved),
SC (Shift Count), PS (Pixel Size), PM (Pixel Mask).
Fields marked * in the figure can be changed only from
supervisor level.


Figure 2.5. Extended Processor Status Register

Fields, from the low-order bit upward: Processor Type,
Stepping Number, IL (Interlock), WP (Write-Protect
Mode), (reserved), INT (Interrupt), DCS (Data Cache
Size), PBM (Page-Table Bit Mode), BE (Big Endian Mode),
OF (Overflow Flag), (reserved). Fields marked * in the
figure can be changed only from supervisor level.


• PIM (Previous Interrupt Mode) and PU (Previous
User Mode) save the corresponding status bits
(IM and U) on a trap, because those status bits
are changed when a trap occurs. They are
restored into their corresponding status bits when
returning from a trap handler with a branch
indirect instruction when a trap flag is set in the psr.

• FT (Floating-Point Trap), DAT (Data Access
Trap), IAT (Instruction Access Trap), IN
(Interrupt), and IT (Instruction Trap) are trap flags.
They are set when the corresponding trap
condition occurs. The trap handler examines these bits
to determine which condition or conditions have
caused the trap.

• DS (Delayed Switch) is set if a trap occurs during
the instruction before dual-instruction mode is
entered or exited. If DS is set and DIM (Dual
Instruction Mode) is clear, the i860 XR microprocessor
switches to dual-instruction mode one instruction
after returning from the trap handler. If DS and
DIM are both set, the i860 XR microprocessor
switches to single-instruction mode one
instruction after returning from the trap handler.

• When a trap occurs, the i860 XR microprocessor
sets DIM if it is executing in dual-instruction
mode; it clears DIM if it is executing in
single-instruction mode. If DIM is set after returning
from a trap handler, the i860 XR microprocessor
resumes execution in dual-instruction mode.
• When KNF (Kill Next Floating-Point Instruction) is
set, the next floating-point instruction is
suppressed (except that its dual-instruction mode bit
is interpreted). A trap handler sets KNF if the
trapped floating-point instruction should not be
reexecuted.

• SC (Shift Count) stores the shift count used by
the last right-shift instruction. It controls the
number of shifts executed by the double-shift
instruction.

• PS (Pixel Size) and PM (Pixel Mask) are used by
the pixel-store instruction and by the graphics
instructions. The values of PS control pixel size as
defined by Table 2.2. The bits in PM correspond
to pixels to be updated by the pixel-store
instruction pst.d. The low-order bit of PM corresponds
to the low-order pixel of the 64-bit source
operand of pst.d. The number of low-order bits of PM
that are actually used is the number of pixels that
fit into 64 bits, which depends upon PS. If a bit of
PM is set, then pst.d stores the corresponding
pixel. Refer also to the pst.d instruction in section
8.


Table 2.2. Values of PS

Value   Pixel Size    Pixel Size
        in bits       in bytes
-----   -----------   -----------
00      8             1
01      16            2
10      32            4
11      (undefined)   (undefined)
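
The relationship between PS, PM, and the pixels selected by pst.d can be sketched as follows. This Python model covers only the selection rule stated above (one PM bit per pixel, low-order bit first); representing the destination as a 64-bit integer and the names `pixels_per_operand` and `pixel_store` are assumptions made for illustration, and any other effects of pst.d are not modeled.

```python
PIXEL_BITS = {0b00: 8, 0b01: 16, 0b10: 32}  # Table 2.2; 0b11 is undefined


def pixels_per_operand(ps: int) -> int:
    """Number of pixels in a 64-bit operand, i.e. the PM bits actually used."""
    if ps not in PIXEL_BITS:
        raise ValueError("PS value 11 is undefined")
    return 64 // PIXEL_BITS[ps]


def pixel_store(dest: int, source: int, ps: int, pm: int) -> int:
    """Copy into dest the source pixels whose PM bit is set."""
    count = pixels_per_operand(ps)
    size = PIXEL_BITS[ps]
    for i in range(count):
        if (pm >> i) & 1:                 # PM bit i selects the i-th low-order pixel
            field = ((1 << size) - 1) << (i * size)
            dest = (dest & ~field) | (source & field)
    return dest & 0xFFFFFFFFFFFFFFFF
```

With 32-bit pixels only the two low-order PM bits matter; with 8-bit pixels all eight PM bits are used.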


2.2.4 EXTENDED PROCESSOR STATUS 
REGISTER 

The extended processor status register (epsr)
contains additional state information for the current
process beyond that stored in the psr. Figure 2.5 shows
the format of the epsr.

• The processor type is one for the i860 XR
microprocessor.

• The stepping number has a unique value that
distinguishes among different revisions of the
processor.

• IL (Interlock) is set if a trap occurs after a lock
instruction but before the load or store following
the subsequent unlock instruction. IL indicates to
the trap handler that a locked sequence has
been interrupted. When the trap handler finds IL
set, it should scan backwards for the lock
instruction and restart at that point. The absence of
a lock instruction within 30-33 instructions of the
trap indicates a programming error.

• WP (write protect) controls the semantics of the
W bit of page table entries. A clear W bit in either
the directory or the page table entry causes
writes to be trapped. When WP is clear, writes
are trapped in user mode, but not in supervisor
mode. When WP is set, writes are trapped in both
user and supervisor modes. After the value of the
WP bit is changed, the TLB must be invalidated
by setting the ITI bit of the dirbase register
before any stores are performed.

• INT (Interrupt) is the value of the INT input pin.

• DCS (Data Cache Size) is a read-only field that
tells the size of the on-chip data cache. The
number of bytes actually available is $2^{12+\mathrm{DCS}}$;
therefore, a value of zero indicates 4 Kbytes, one
indicates 8 Kbytes, etc.
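
Two of the epsr rules above reduce to one-line computations. The Python sketch below takes the field values as plain arguments rather than extracting them from a register image; the function names are illustrative, and `w_bit` stands for the combined W permission (set only if W is set in both the directory and the page table entry), which is an assumption about how a caller would use it.

```python
def data_cache_bytes(dcs: int) -> int:
    """Data cache size in bytes: 2 ** (12 + DCS)."""
    if dcs < 0:
        raise ValueError("DCS is an unsigned field")
    return 1 << (12 + dcs)


def write_trapped(w_bit: int, user_mode: bool, wp: int) -> bool:
    """Whether a write is trapped because of the W bit.

    With W clear, writes trap in user mode always, and in supervisor
    mode only when WP is set. With W set, this rule causes no trap.
    """
    if w_bit:
        return False
    return user_mode or bool(wp)
```

Since the i860 XR has an 8 Kbyte data cache, a DCS value of one corresponds to this part.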


Figure 2.6. Directory Base Register

Fields, from the low-order bit upward: ATE (Address
Translation Enable), DPS (DRAM Page Size), BL (Bus
Lock), ITI (I-Cache, TLB Invalidate), (reserved), CS8
(Code Size 8-Bit), RB (Replacement Block), RC
(Replacement Control), DTB (Directory Table Base).
Fields marked * in the figure can be changed only from
supervisor level.
• PBM (Page-Table Bit Mode) determines which bit
of page-table entries is output on the PTB pin.
When PBM is clear, the PTB signal reflects bit CD
of the page-table entry used for the current cycle.
When PBM is set, the PTB signal reflects bit WT
of the page-table entry used for the current cycle.

• BE (Big Endian) controls the ordering of bytes
within a data item in memory. Normally (i.e. when
BE is clear) the i860 XR microprocessor operates
in little endian mode, in which the addressed byte
is the low-order byte. When BE is set (big endian
mode), the low-order three bits of all load and
store addresses are complemented, then
masked to the appropriate boundary for
alignment. This causes the addressed byte to be the
most significant byte. Section 2.3 discusses little
and big endian addressing.

• OF (Overflow Flag) is set by adds, addu, subs,
and subu when integer overflow occurs. For
adds and subs, OF is set if the carry from bit 31
is different from the carry from bit 30. For addu,
OF is set if there is a carry from bit 31. For subu,
OF is set if there is no carry from bit 31. Under all
other conditions, it is cleared by these
instructions. OF controls the function of the intovr
instruction. OF cannot be written in user mode
using st.c.
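
The carry-based OF rules for the four arithmetic instructions can be modeled directly. The Python sketch below is an illustration under one stated assumption: subtraction is computed as src1 + ~src2 + 1, so that "carry from bit 31" for subu means "no borrow". The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def _add_with_carries(a: int, b: int, carry_in: int):
    """32-bit add returning (sum, carry out of bit 30, carry out of bit 31)."""
    a &= MASK32
    b &= MASK32
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    total = a + b + carry_in
    return total & MASK32, carry30, total >> 32


def overflow_flag(op: str, src1: int, src2: int) -> bool:
    """OF as described for adds, addu, subs, and subu."""
    if op in ("adds", "addu"):
        _, c30, c31 = _add_with_carries(src1, src2, 0)
    elif op in ("subs", "subu"):
        # Assumption: subtraction is performed as src1 + ~src2 + 1.
        _, c30, c31 = _add_with_carries(src1, ~src2, 1)
    else:
        raise ValueError("unknown instruction")
    if op in ("adds", "subs"):
        return c30 != c31     # signed overflow
    if op == "addu":
        return c31 == 1       # unsigned carry out
    return c31 == 0           # subu: no carry means a borrow occurred
```

For example, adds of 0x7FFFFFFF and 1 sets OF, while subu of 5 and 3 leaves it clear.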


2.2.5 DATA BREAKPOINT REGISTER 

The data breakpoint register (db) is used to gener- 
ate a trap when the i860 XR microprocessor makes 
a data-operand access to the address stored in this 
register. The trap is enabled by BR and BW in psr. 
The db register can only be changed from supervi- 
sor level. When comparing, a number of low order 
bits of the address are ignored, depending on the 
size of the operand. For example, a 16-bit access 
ignores the low-order bit of the address when com- 
paring to db; a 32-bit access ignores the low-order 
two bits. This ensures that any access that overlaps 
the address contained in the register will generate a 
trap. The DAT occurs before the data is accessed 
and prevents the load or store from completing. 
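
The address comparison can be sketched as a masked equality test. In the Python model below, `breakpoint_hit` is a hypothetical name; the text gives the 16-bit and 32-bit cases explicitly, and extending the same pattern to 64- and 128-bit operands is an assumption. The BR and BW enable bits are not modeled.

```python
def breakpoint_hit(access_addr: int, operand_bytes: int, db: int) -> bool:
    """True if an access of operand_bytes at access_addr matches db.

    The low-order log2(operand_bytes) address bits are ignored, so an
    aligned access that overlaps the byte addressed by db matches.
    """
    if operand_bytes not in (1, 2, 4, 8, 16):
        raise ValueError("operand size must be 1, 2, 4, 8, or 16 bytes")
    ignore = operand_bytes - 1
    return (access_addr & ~ignore) == (db & ~ignore)
```

A 32-bit access at 0x1000 therefore matches a db value of 0x1002, because the word at 0x1000 contains that byte.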


2.2.6 DIRECTORY BASE REGISTER 

The directory base register dirbase (shown in Figure
2.6) controls address translation, caching, and bus
options. The dirbase register can only be changed
from supervisor level. The BL bit, however, is changed
from user level with the lock and unlock instructions.

• ATE (Address Translation Enable), when set,
enables the virtual-address translation algorithm.
The data cache must be flushed before changing
the ATE bit.

• DPS (DRAM Page Size) controls how many bits
to ignore when comparing the current bus-cycle
address with the previous bus-cycle address to
generate the NENE# signal. This feature allows
for higher speeds when using static column or
page-mode DRAMs and consecutive reads and
writes access the same row. The comparison ignores
the low-order 12 + DPS bits. A value of zero is
appropriate for one bank of 256K x n RAMs, 1
for 1M x n RAMs, etc. For interleaved memory,
increase DPS by one for each power of
interleaving: add one for 2-way, two for 4-way, etc.

• When BL (Bus Lock) is set, external bus
accesses are locked. The LOCK# signal is asserted for
the next bus cycle whose internal bus request is
generated after BL is set. It remains asserted on every
subsequent bus cycle as long as BL remains set.
The LOCK# signal is deasserted on the next
load or store instruction after BL is cleared. Traps
immediately clear BL. The lock and unlock
instructions control the BL bit. The result of
modifying BL with the st.c instruction is not defined.

• ITI (I-Cache, TLB Invalidate), when set in the
value that is loaded into dirbase, causes all entries
in the instruction cache and address-translation
cache (TLB) to be invalidated. The ITI bit does
not remain set in dirbase. ITI always appears as
zero when reading dirbase. Section 2.5 discusses
flushing the data cache before invalidating the
TLB.

• When CS8 (Code Size 8-Bit) is set, instruction
cache misses are processed as 8-bit bus cycles.
When this bit is clear, instruction cache misses
are processed as 64-bit bus cycles. This bit
cannot be set by software; hardware sets this bit at
initialization time. It can be cleared by software
(one time only) to allow the system to execute out
of 64-bit memory after bootstrapping from 8-bit
EPROM. A nondelayed branch to code in 64-bit
memory should directly follow the st.c (store
control register) instruction that clears CS8, in order
to make the transition from 8-bit to 64-bit memory
occur at the correct time. The branch instruction
must be aligned on a 64-bit boundary.

• RB (Replacement Block) identifies the cache
block to be replaced by cache replacement
algorithms. The high-order bit of RB is ignored by the
instruction and data caches. RB conditions the
cache flush instruction flush, which is discussed
in Section 8. Table 2.3 explains the values of RB.

• RC (Replacement Control) controls cache
replacement algorithms. Table 2.4 explains the
significance of the values of RC.

• DTB (Directory Table Base) contains the
high-order 20 bits of the physical address of the page
directory when address translation is enabled (i.e.
ATE = 1). The low-order 12 bits of the address
are zeros.
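
The DPS comparison and the DTB address formation described in the list above can be sketched as follows. This Python model covers only the address comparison (it does not model the NENE# pin itself), and all three function names are hypothetical.

```python
def same_dram_page(prev_addr: int, cur_addr: int, dps: int) -> bool:
    """Compare two bus-cycle addresses, ignoring the low-order 12 + DPS bits."""
    shift = 12 + dps
    return (prev_addr >> shift) == (cur_addr >> shift)


def dps_for(base_dps: int, interleave_ways: int) -> int:
    """Add one to DPS per power of two of interleaving (2-way +1, 4-way +2)."""
    if interleave_ways < 1 or interleave_ways & (interleave_ways - 1):
        raise ValueError("interleave factor must be a power of two")
    return base_dps + interleave_ways.bit_length() - 1


def page_directory_address(dtb: int) -> int:
    """Physical address of the page directory: DTB followed by 12 zero bits."""
    return (dtb & 0xFFFFF) << 12
```

With DPS = 0 the comparison treats addresses within the same 4 Kbyte region as matching; each increment of DPS doubles that region.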
Figure 2.7. Floating-Point Status Register

Fields, from the low-order bit upward: FZ (Flush
Zero), TI (Trap Inexact), RM (Rounding Mode), U
(Update), FTE (Floating-Point Trap Enable), (reserved),
SI (Sticky Inexact Flag), SE (Source Exception), MU
(Multiplier Underflow), MO (Multiplier Overflow), MI
(Multiplier Inexact), MA (Multiplier Add One), AU
(Adder Underflow), AO (Adder Overflow), and the
remaining fields described in section 2.2.8 (AI, AA, RR,
AE, LRP, IRP, MRP, ARP).


Table 2.3. Values of RB

Value   Replace      Replace Instruction
        TLB Block    and Data Cache Block
-----   ---------    --------------------
00      0            0
01      1            1
10      2            0
11      3            1


Table 2.4. Values of RC

Value   Meaning
-----   -------
00      Selects the normal replacement algorithm where any
        block in the set may be replaced on cache misses in
        all caches.
01      Instruction, data, and TLB cache misses replace the
        block selected by RB. The instruction and data caches
        ignore the high-order bit of RB. This mode is used for
        instruction cache and TLB testing.
10      Data cache misses replace the block selected by the
        low-order bit of RB. Instruction and TLB caches use
        random replacement.
11      Disables data cache replacement. Instruction and TLB
        caches use random replacement.


2.2.7 FAULT INSTRUCTION REGISTER 

When a trap occurs, this register contains the
address of the trapping instruction (not necessarily the
instruction that created the conditions that required
the trap). The fir is a read-only register. In
single-instruction mode, using a ld.c instruction to read
the fir anytime except the first time after a trap saves
in idest the address of the ld.c instruction; in
dual-instruction mode, the address of its floating-point
companion (address of the ld.c - 4) is saved.

2.2.8 FLOATING-POINT STATUS REGISTER 

The floating-point status register (fsr) contains the
floating-point trap and rounding-mode status for the
current process. Figure 2.7 shows its format. The fsr
is writable from user level.

• If FZ (Flush Zero) is clear and underflow occurs,
a result-exception trap is generated. When FZ is
set and underflow occurs, the result is set to zero,
and no trap due to underflow occurs.

• If TI (Trap Inexact) is clear, inexact results do not
cause a trap. If TI is set, inexact results cause a
trap. The sticky inexact flag (SI) is set whenever
an inexact result is produced, regardless of the
setting of TI.

• RM (Rounding Mode) specifies one of the four
rounding modes defined by the IEEE standard.
Given a true result b that cannot be represented
by the target data type, the i860 XR microprocessor
determines the two representable numbers a
and c that most closely bracket b in value (a < b
< c). The i860 XR microprocessor then rounds
(changes) b to a or c according to the mode
selected by RM as defined in Table 2.5. Rounding
introduces an error in the result that is less than
one least-significant bit.

Table 2.5. Values of RM

Value   Rounding Mode              Rounding Action
-----   ------------------------   -------------------------------------
00      Round to nearest or even   Closer to b of a or c; if equally
                                   close, select even number (the one
                                   whose least significant bit is zero).
01      Round down (toward -∞)     a
10      Round up (toward +∞)       c
11      Chop (toward zero)         Smaller in magnitude of a or c.

• The U-bit (Update Bit), If set in the value that is 
loaded into fsr by a st.c instruction, enables up- 
dating of the result-status bits (AE, AA, Al, AO, 
AU, MA, Ml, MO, and MU) In the first-stage of the 
floating-point adder and multiplier pipelines. If this 
bit is clear, the result-status bits are unaffected 
by a st.c Instruction; st.c ignores the correspond- 
ing bits in the value that is being loaded. A st.c 
always updates fsr bits 21 ..17 and 8..0 directly. 
The U-bit does not remain set; it always appears 
as zero when read. 

• The FTE (Floating-Point Trap Enable) bit, If clear, 
disables all floating-point traps (Invalid input oper- 
and, overflow, underflow, and inexact result). 

• SI (Sticky Inexact) is set when the last stage re- 
sult of either the multiplier or adder Is inexact (l.e. 
when either Al or Ml Is set). SI is “sticky” in the 
sense that it remains set until reset by software. 
Al and Ml, on the other hand, can by changed by 
the subsequent floating-point Instruction. 

• SE (Source Exception) is set when one of the 
source operands of a floating-point operation Is 
invalid; It is cleared when all the input operands 
are valid. Invalid input operands include denor- 
mals, Infinities, and all NaNs (both quiet and sig- 
naling). 

• When read from the fsr, the result-status bits MA, 
Ml, MO, and MU (Multiplier Add-One, Inexact, 
Overflow, and Underflow, respectively) describe 
the last stage result of the multiplier. 

When read from the fsr, the result-status bits AA, 
Al, AO, AU, and AE (Adder Add-One, Inexact, 
Overflow, Underflow, and Exponent, respectively) 
describe the last stage result of the adder. The 
high-order three bits of the 1 1 -bit exponent of the 
adder result are stored in the AE field. 

The Adder Add-One and Multiplier Add-One bits
indicate that the absolute value of the result
fraction grew by one least-significant bit due to
rounding. AA and MA are not influenced by the
sign of the result.


After a floating-point operation in a given unit (ad- 
der or multiplier), the result-status bits of that unit 
are undefined until the point at which result ex- 
ceptions are reported. 

When written to the fsr with the U-bit set, the 
result-status bits are placed into the first stage of 
the adder and multiplier pipelines. When the 
processor executes pipelined operations, it prop- 
agates the result-status bits of a particular unit 
(multiplier or adder) one stage for each pipelined 
floating-point operation for that unit. When they 
reach the last stage, they replace the normal
result-status bits in the fsr. When the U-bit is not
set, result-status bits in the word being written to
the fsr are ignored.



In a floating-point dual-operation instruction (e.g. 
add-and-multiply or subtract-and-multiply), both 
the multiplier and the adder may set exception 
bits. The result-status bits for a particular unit re- 
main set until the next operation that uses that 
unit. 

• RR (Result Register) specifies which floating- 
point register (f0-f31) was the destination regis- 
ter when a result-exception trap occurs due to a 
scalar operation. 


• LRP (Load Pipe Result Precision), IRP (Integer 
(Graphics) Pipe Result Precision), MRP (Multiplier 
Pipe Result Precision), and ARP (Adder Pipe Re- 
sult Precision) aid in restoring pipeline state after 
a trap or process switch. Each defines the preci- 
sion of the last stage result in the corresponding 
pipeline. One of these bits is set when the result 
in the last stage of the corresponding pipeline is 
double precision; it is cleared if the result is single
precision. These bits cannot be changed by software.


2.2.9 KR, KI, T, AND MERGE REGISTERS

The KR, KI, and T registers are special-purpose
registers used by the dual-operation floating-point
instructions pfam, pfmam, pfsm, and pfmsm,
which initiate both an adder (A-unit) operation and a
multiplier (M-unit) operation. The KR, KI, and T
registers can store values from one dual-operation
instruction and supply them as inputs to subsequent
dual-operation instructions. (Refer to Figure 2.14.)

The MERGE register is used only by the graphics 
instructions. The purpose of the MERGE register is 
to accumulate (or merge) the results of multiple-ad- 
dition operations that use as operands the color-in- 
tensity values from pixels or distance values from a 
Z-buffer. The accumulated results can then be
stored in one 64-bit operation.

Two multiple-addition instructions and an OR
instruction use the MERGE register. The addition
instructions are designed to add interpolation values
to each color-intensity field in an array of pixels or to
each distance value in a Z-buffer.

Refer to the instruction descriptions in section 8 for 
more information about these registers. 


2.3 Addressing 

Memory is addressed in byte units with a paged
virtual-address space of 2^32 bytes. Data and
instructions can be located anywhere in this address
space. Address arithmetic is performed using 32-bit
input values and produces 32-bit results. The
low-order 32 bits of the result are used in case of overflow.

Normally, multibyte data values are stored in memory
in little endian format, i.e., with the least significant
byte at the lowest memory address. As an option,
the ordering can be dynamically selected by soft- 
ware in supervisor mode. The i860 XR microproces- 
sor also offers big endian mode, in which the most 
significant byte of a data item is at the lowest ad- 
dress. Figure 2.8 shows the difference between the 
two storage modes. Big endian and little endian data 
areas should not be mixed within a 64-bit data word. 
Illustrations of data structures in this data sheet 
show data stored in little endian mode, i.e., the low- 
order byte is at the lowest memory address. 
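
The difference between the two storage modes can be illustrated with a short sketch. This is not i860 code; it is a host-side Python illustration, and the function name byte_layout and the example value are hypothetical.

```python
import struct

def byte_layout(value: int, big_endian: bool) -> bytes:
    """Return the four bytes of a 32-bit value in ascending address order."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)

value = 0x11223344  # arbitrary 32-bit example value

# Little endian: least significant byte (0x44) at the lowest address.
print(byte_layout(value, big_endian=False).hex())

# Big endian: most significant byte (0x11) at the lowest address.
print(byte_layout(value, big_endian=True).hex())
```

The first byte of each result corresponds to the lowest memory address of the stored item.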


Code accesses are always done with little endian 
addressing. This implies that code will appear differ- 
ently than documented here when accessed as big 
endian data. Intel recommends that disassemblers
running in a big endian system convert instructions
which have been read as data back to little endian
form and present them in the format documented
here.

Page directories and page tables are also accessed 
in little endian mode, regardless of the value of the 
BE bit. 

Alignment requirements are as follows (any violation 
results in a data-access trap): 

• 128-bit values are aligned on 16-byte boundaries
when referenced in memory (i.e. the four least
significant address bits must be zero).

• 64-bit values are aligned on 8-byte boundaries
when referenced in memory (i.e. the three least
significant address bits must be zero).

• 32-bit values are aligned on 4-byte boundaries
when referenced in memory (i.e. the two least
significant address bits must be zero).

• 16-bit values are aligned on 2-byte boundaries
when referenced in memory (i.e. the least
significant address bit must be zero).
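
The alignment rules above reduce to a single mask test on the low-order address bits. The following is a minimal sketch in Python (a host-side model, not processor code); the function name is_aligned is hypothetical.

```python
def is_aligned(address: int, size_bits: int) -> bool:
    """Check an address against the alignment rule for an operand size.

    Only the operand sizes listed in the text (16, 32, 64, 128 bits)
    are accepted.
    """
    if size_bits not in (16, 32, 64, 128):
        raise ValueError("unsupported operand size")
    size_bytes = size_bits // 8
    # An N-byte operand needs the low log2(N) address bits to be zero.
    return address & (size_bytes - 1) == 0

print(is_aligned(0x1008, 64))   # 8-byte boundary
print(is_aligned(0x1008, 128))  # not a 16-byte boundary
```

On the processor, an access for which this test fails results in a data-access trap rather than a Boolean result.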


2.4 Virtual Addressing 

When address translation is enabled, the i860 XR
microprocessor maps instruction and data virtual
addresses into physical addresses before referencing
memory. This address transformation is compatible
with that of the Intel386(TM) microprocessor and
implements the basic features needed for page-oriented
virtual-memory systems and page-level protection.

The address translation is optional. Address transla- 
tion is in effect only when the ATE bit of dirbase is 
set. This bit is typically set by the operating system 
during software initialization. The ATE bit must be 
set if the operating system is to implement page-oriented
protection or page-oriented virtual memory.

Figure 2.9. Format of a Virtual Address


Address translation is disabled when the processor
is reset. It is enabled when a store to dirbase sets
the ATE bit. It is disabled again when a store clears
the ATE bit.


2.4.1 PAGE FRAME 

A page frame is a 4-Kbyte unit of contiguous ad- 
dresses of physical main memory. Page frames be- 
gin on 4-Kbyte boundaries and are fixed in size. A 
page is the collection of data that occupies a page 
frame when that data is present in main memory. 
The data may also occupy some location in second- 
ary storage when there is not sufficient space in 
main memory. 

2.4.2 VIRTUAL ADDRESS 

A virtual address refers indirectly to a physical
address by specifying a page table, a page within that
table, and an offset within that page. Figure 2.9
shows the format of a virtual address.

Figure 2.10 shows how the i860 XR microprocessor 
converts the DIR, PAGE, and OFFSET fields of a 
virtual address into the physical address by consult- 
ing two levels of page tables. The addressing mech- 
anism uses the DIR field as an index into a page 
directory, uses the PAGE field as an index into the
page table determined by the page directory, and 
uses the OFFSET field to address a byte within the 
page determined by the page table. 
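
As an illustration, the field extraction can be sketched as follows. The field widths (10-bit DIR, 10-bit PAGE, 12-bit OFFSET) are an assumption inferred from the 1K-entry tables and 4-Kbyte pages described in this section, since Figure 2.9 is not reproduced here; the function name split_virtual_address is hypothetical.

```python
def split_virtual_address(va: int):
    """Split a 32-bit virtual address into (DIR, PAGE, OFFSET).

    Assumed layout: DIR = bits 31..22, PAGE = bits 21..12,
    OFFSET = bits 11..0.
    """
    if not 0 <= va < 2**32:
        raise ValueError("virtual address must fit in 32 bits")
    dir_index = (va >> 22) & 0x3FF   # index into the page directory
    page_index = (va >> 12) & 0x3FF  # index into the page table
    offset = va & 0xFFF              # byte within the 4-Kbyte page
    return dir_index, page_index, offset

print(split_virtual_address(0x00403025))
```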


2.4.3 PAGE TABLES 

A page table is simply an array of 32-bit page
specifiers. A page table is itself a page, and therefore
contains 4 Kbytes of memory or at most 1K 32-bit
entries.

Figure 2.10. Address Translation

Two levels of tables are used to address a page of
memory. At the higher level is a page directory. The
page directory addresses up to 1K page tables of
the second level. A page table of the second level
addresses up to 1K pages. All the tables addressed
by one page directory, therefore, can address 1M
pages (2^20). Because each page contains 4 Kbytes
(2^12 bytes), the tables of one page directory can
span the entire physical address space of the i860
XR microprocessor (2^20 x 2^12 = 2^32).

The physical address of the current page directory is
stored in the DTB field of the dirbase register. Memory
management software has the option of using one 
page directory for all processes, one page directory 
for each process, or some combination of the two. 


2.4.4 PAGE-TABLE ENTRIES 

Page-table entries (PTEs) in either level of page ta- 
bles have the same format. Figure 2.11 illustrates 
this format. 


2.4.4.1 Page Frame Address 

The page frame address specifies the physical start- 
ing address of a page. Because pages are located 
on 4K boundaries, the low-order 12 bits are always 
zero. In a page directory, the page frame address is 
the address of a page table. In a second-level page 
table, the page frame address is the address of the 
page frame that contains the desired memory oper- 
and. 


2.4.4.2 Present Bit 

The P (present) bit indicates whether a page table
entry can be used in address translation. P = 1
indicates that the entry can be used. When P = 0 in
either level of page tables, the entry is not valid for
address translation, and the rest of the entry is
available for software use; none of the other bits in the
entry is tested by the hardware. If P = 0 in either
level of page tables when an attempt is made to use
a page-table entry for address translation, the
processor signals either a data-access fault or an
instruction-access fault. In software systems that
support paged virtual memory, the trap handler can
bring the required page into physical memory.


Note that there is no P bit for the page directory
itself. The page directory may be not-present while
the associated process is suspended, but the
operating system must ensure that the page directory
indicated by the dirbase image associated with the
process is present in physical memory before the
process is dispatched.



2.4.4.3 Writable and User Bits 

The W (writable) and U (user) bits are used for page- 
level protection, which the i860 XR microprocessor 
performs at the same time as address translation. 
The concept of privilege for pages is implemented 
by assigning each page to one of two levels:

1. Supervisor level (U = 0): for the operating
system and other systems software and related data.

2. User level (U = 1): for applications procedures
and data.

The U bit of the psr indicates whether the i860 XR
microprocessor is executing at user or supervisor
level. The i860 XR microprocessor maintains the U
bit of psr as follows:


  Bits     Field
  31..12   PAGE FRAME ADDRESS 31..12
  11..9    AVAIL (available for systems program use)
  8..7     X (reserved)
  6        D (dirty)
  5        A (accessed)
  4        CD (cache disable)
  3        WT (write-through)
  2        U (user)
  1        W (writable)
  0        P (present)

NOTE:
X indicates Intel reserved. Do not use.

Figure 2.11. Format of a Page Table Entry

• The i860 XR microprocessor clears the psr U bit
to indicate supervisor level when a trap occurs 
(including when the trap instruction causes the 
trap). The prior value of U is copied into PU. 

• The i860 XR microprocessor copies the psr PU 
bit into the U bit when an indirect branch is exe- 
cuted and one of the trap bits is set. If PU was 
one, the i860 XR microprocessor enters user 
level. 

With the U bit of psr and the W and U bits of the 
page table entries, the i860 XR microprocessor im- 
plements the following protection rules: 

• When at user level, a read or write of a supervi- 
sor-level page causes a trap. 

• When at user level, a write to a page whose W bit
is clear causes a trap.

• When at user level, st.c to certain control regis- 
ters is ignored. 

When the i860 XR microprocessor is executing at
supervisor level, all pages are addressable; when it
is executing at user level, only pages that belong to
the user level are addressable.

When the i860 XR microprocessor is executing at
supervisor level, all pages are readable. Whether a
page is writable depends upon the write-protection
mode controlled by WP of epsr:

WP = 0   All pages are writable.

WP = 1   A write to a page whose W bit is clear
         causes a trap.

When the i860 XR microprocessor is executing at 
user level, only pages that belong to user level and 
are marked writable are actually writable; pages that 
belong to supervisor level are neither readable nor 
writable from user level. 

2.4.4.4 Write-Through Bit 

The i860 XR microprocessor does not implement a 
write-through caching policy for the on-chip data 
cache; however, the WT (write-through) bit in the 
second-level page-table entry does determine inter- 
nal caching policy. If WT is set in a PTE, on-chip 
caching of data from the corresponding page is in- 
hibited. The i860 XR CPU may place pages having 
WT = 1 into the instruction cache. Future imple- 
mentations of the i860 XR architecture may adhere 
to a write-through data caching policy. Therefore, 
they may cache pages having the WT bit of the PTE 
set. If WT is clear, the normal write-back policy is 
applied to data from the page in the on-chip caches. 
The WT bit of page directory entries is not refer- 
enced by the processor, but is reserved. 

The WT bit is independent of the CD bit; therefore, 
data may be placed in a second-level coherent 
cache, but kept out of the on-chip caches. 


2.4.4.5 Cache Disable Bit 

If the CD (cache disable) bit in the second-level
page-table entry is set, data from the associated
page is not placed in instruction or data caches.
Clearing CD permits the cache hardware to place
data from the associated page into caches. The CD
bit of page directory entries is not referenced by the
processor, but is reserved.

To control external caches, the i860 XR
microprocessor outputs on its PTB pin either the CD or WT bit.
The PBM bit of epsr determines which bit is output.

2.4.4.6 Accessed and Dirty Bits 

The A (accessed) and D (dirty) bits provide data 
about page usage in both levels of the page tables. 

The i860 XR microprocessor sets the corresponding
accessed bits in both levels of page tables before a
read or write operation to a page. The processor
tests the dirty bit in the second-level page table
before a write to an address covered by that page table
entry, and, under certain conditions, causes traps.
The trap handler then has the opportunity to
maintain appropriate values in the dirty bits. The dirty bit
in directory entries is not tested by the i860 XR
microprocessor. The precise algorithm for using these
bits is specified in Section 2.4.5.

An operating system that supports paged virtual 
memory can use these bits to determine what pages 
to eliminate from physical memory when the de- 
mand for memory exceeds the physical memory 
available. The D and A bits in the PTE (page-table
entry) are normally initialized to zero by the
operating system. The processor sets the A bit when a
page is accessed either by a read or write operation.
When a data- or instruction-access fault occurs, the
trap handler sets the D bit if an allowable write is
being performed, then re-executes the instruction.

The operating system is responsible for coordinating 
its updates to the accessed and dirty bits with up- 
dates by the CPU and by other processors that may 
share the page tables. The i860 XR microprocessor 
automatically asserts the LOCK# signal while
setting the A bit. If an A-bit of a PTE is found not set
during a locked sequence (created by the lock
instruction), a trap will occur and the processor will not
update the A-bit.
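
The fields described in Sections 2.4.4.1 through 2.4.4.6 can be unpacked from a 32-bit entry as sketched below. The bit positions are those read from Figure 2.11 (P = bit 0 up through D = bit 6, AVAIL = bits 11..9, page frame address = bits 31..12) and should be checked against the figure; the names PTE_FLAGS and decode_pte are hypothetical.

```python
# Single-bit fields of a page-table entry and their assumed bit positions.
PTE_FLAGS = {"P": 0, "W": 1, "U": 2, "WT": 3, "CD": 4, "A": 5, "D": 6}

def decode_pte(pte: int) -> dict:
    """Unpack a 32-bit page-table entry into named fields."""
    fields = {name: (pte >> bit) & 1 for name, bit in PTE_FLAGS.items()}
    fields["AVAIL"] = (pte >> 9) & 0x7  # available for systems program use
    fields["PFA"] = pte & 0xFFFFF000    # page frame address, low 12 bits zero
    return fields

print(decode_pte(0x00ABC067))
```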

2.4.4.7 Combining Protection of Both Levels of 
Page Tables 

For any one page, the protection attributes of its
page directory entry may differ from those of its
page table entry. The i860 XR microprocessor
computes the effective protection attributes for a page
by examining the protection attributes in both the
directory and the page table. Table 2.6 shows the
effective protection provided by the possible
combinations of protection attributes.


2.4.5 ADDRESS TRANSLATION ALGORITHM 

The algorithm below defines the translation of each 
virtual address to a physical address. Let DIR, 
PAGE, and OFFSET be the fields of the virtual ad- 
dress; let PFA1 and PFA2 be the page frame ad- 
dress fields of the first and second level page tables 
respectively; DTB is the page directory table base 
address stored in the dirbase register. 

1. Read the PTE (page table entry) at the physical
address formed by DTB:DIR:00.

2. If P in the PTE is zero, generate a data- or
instruction-access fault.

3. If W in the PTE is zero, the operation is a write,
and either the U-bit of the psr is set or WP = 1,
generate a data- or instruction-access fault.

4. If the U-bit in the PTE is zero and the U-bit in the
psr is set, generate a data- or instruction-access
fault.

5. If A in the PTE is zero, and if the TLB miss
occurred while the bus was locked, generate a
data- or instruction-access fault. (The trap allows
software to set A to one and restart the sequence.
This avoids ambiguity in determining what address
corresponds to a locked semaphore for external bus
hardware use.)

6. If A in the PTE is zero, and if the TLB miss
occurred while the bus was not locked, assert
LOCK#. Re-fetch and check the PTE, set A, and
store the PTE. Deassert LOCK# during the store.

7. Locate the PTE at the physical address formed by
PFA1:PAGE:00.

8. Perform the P, W, U, and A checks as in steps 2
through 6 with the second-level PTE.

9. If D in the PTE is clear and the operation is a
write, generate a data- or instruction-access fault.

10. Form the physical address as PFA2:OFFSET.
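
The ten steps above can be modeled in software as in the following sketch. It is a simplified host-side model, not a description of the hardware: memory is a hypothetical dictionary mapping physical addresses to 32-bit words, the bit positions are those read from Figure 2.11, the field widths are the 10/10/12 split implied by the table sizes, LOCK# handling is reduced to a flag, and the names AccessFault, check_entry, and translate are illustrative only.

```python
class AccessFault(Exception):
    """Stands in for the data- or instruction-access fault."""

# Assumed PTE bit positions (see Figure 2.11).
P, W, U, A, D = 1 << 0, 1 << 1, 1 << 2, 1 << 5, 1 << 6
PFA_MASK = 0xFFFFF000

def check_entry(memory, addr, is_write, psr_u, wp, bus_locked):
    """Steps 2 through 6, applied to the entry at physical address addr."""
    pte = memory.get(addr, 0)  # missing entries read as zero (not present)
    if not pte & P:
        raise AccessFault("entry not present")
    if not pte & W and is_write and (psr_u or wp):
        raise AccessFault("write to write-protected page")
    if not pte & U and psr_u:
        raise AccessFault("user access to supervisor page")
    if not pte & A:
        if bus_locked:
            raise AccessFault("A clear during locked sequence")
        memory[addr] = pte = pte | A  # hardware does this under LOCK#
    return pte

def translate(va, memory, dtb, is_write=False, psr_u=False, wp=False,
              bus_locked=False):
    """Translate a 32-bit virtual address; dtb is the 4K-aligned
    physical address of the page directory."""
    dir_i, page_i, offset = (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF
    # Steps 1-6: first-level entry at DTB:DIR:00.
    pde = check_entry(memory, (dtb & PFA_MASK) | (dir_i << 2),
                      is_write, psr_u, wp, bus_locked)
    # Steps 7-8: second-level entry at PFA1:PAGE:00.
    pte = check_entry(memory, (pde & PFA_MASK) | (page_i << 2),
                      is_write, psr_u, wp, bus_locked)
    # Step 9: a write with D clear traps so software can set D.
    if is_write and not pte & D:
        raise AccessFault("D clear on write")
    # Step 10: PFA2:OFFSET.
    return (pte & PFA_MASK) | offset
```

For example, with a directory at 0x1000 whose entry 1 points to a table at 0x2000, and table entry 3 pointing to the frame at 0x5000, a user-level write to virtual address 0x00403025 translates to 0x5025, provided both entries have P, W, and U set and the second-level entry has D set.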

The i860 XR microprocessor looks only in external
memory for Page Directories and Page Tables in
the translation process. The data cache is not
searched. Therefore, any code which modifies Page 
Directories or Page Tables must keep them out of 
the cache. The tables should be kept in non-cache- 
able memory, or flushed from the cache. 



Table 2.6. Combining Directory and Page Protections

  Page Directory    Page Table           Combined Protection
      Entry            Entry        User      Supervisor Access
                                   Access
  U-bit   W-bit    U-bit   W-bit   WP = X     WP = 0     WP = 1
  -----   -----    -----   -----   ------     ------     ------
    0       0        0       0       N         R/W         R
    0       0        0       1       N         R/W         R
    0       0        1       0       N         R/W         R
    0       0        1       1       N         R/W         R
    0       1        0       0       N         R/W         R
    0       1        0       1       N         R/W         R/W
    0       1        1       0       N         R/W         R
    0       1        1       1       N         R/W         R/W
    1       0        0       0       N         R/W         R
    1       0        0       1       N         R/W         R
    1       0        1       0       R         R/W         R
    1       0        1       1       R         R/W         R
    1       1        0       0       N         R/W         R
    1       1        0       1       N         R/W         R/W
    1       1        1       0       R         R/W         R
    1       1        1       1       R/W       R/W         R/W

NOTES:

N = No access allowed      R/W = Both reads and writes allowed
R = Read access only       X = Don't care
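
The sixteen rows of Table 2.6 follow a compact rule, sketched below as a Python function. The rule is inferred from the table rows rather than stated in the text, and the function name combined_access is hypothetical.

```python
def combined_access(dir_u, dir_w, pt_u, pt_w, psr_u, wp):
    """Return "N", "R", or "R/W" for the given directory and page-table
    U/W bits, the psr U bit (True at user level), and the epsr WP bit."""
    writable = bool(dir_w and pt_w)  # W must be set at both levels
    if psr_u:
        # User level: both levels must mark the page as user-level.
        if not (dir_u and pt_u):
            return "N"
        return "R/W" if writable else "R"
    # Supervisor level: every page is readable; WP selects write checking.
    if not wp:
        return "R/W"
    return "R/W" if writable else "R"

print(combined_access(1, 0, 1, 1, psr_u=True, wp=False))
```

In this model, user-level access ignores WP, matching the WP = X column of the table.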

The i860 XR microprocessor expects Page Directories
and Page Tables to be in little endian format.
The operating system must maintain these tables in
little endian format by either setting BE = 0 when
manipulating the tables or by complementing bit 2 of
the address when loading or storing entries.

2.4.6 ADDRESS TRANSLATION FAULTS 

The address translation fault is one instance of the 
data-access fault. The instruction causing the fault 
can be re-executed upon returning from the trap 
handler. 


2.4.7 PAGE TRANSLATION CACHE 

For greatest efficiency in address translation, the 
i860 XR microprocessor stores the most recently 
used page-table data in an on-chip cache called the 
TLB (translation lookaside buffer). Only if the
necessary paging information is not in the cache must
both levels of page tables be referenced.


2.5 Caching and Cache Flushing 

The i860 XR microprocessor has the ability to cache
instruction, data, and address-translation informa- 
tion in on-chip caches. Caching uses virtual-address 
tags. The effects of mapping two different virtual ad- 
dresses in the same address space to the same 
physical address are undefined. 

Instruction, data, and address-translation caching on 
the i860 XR microprocessor are not transparent. Be- 
cause the data cache uses a write-back protocol, 
writes do not immediately update memory, and 
writes to memory by other bus devices do not up- 
date the cache. Changes to page tables do not auto- 
matically update the TLB, and changes to instruc- 
tions do not automatically update the instruction 
cache. Under certain circumstances, such as I/O 
references, self-modifying code, page-table up- 
dates, or shared data in a multiprocessing system, it 
is necessary to bypass or to flush the caches. The 
i860 XR microprocessor provides the following 
methods for doing this: 

• Bypassing Instruction and Data Caches. If
deasserted during cache-miss processing, the
KEN# pin disables instruction and data caching
of the referenced data. If the CD bit of the
associated second-level PTE is set, caching of data and
instructions is disabled. The i860 XR CPU may
place pages having WT = 1 into the instruction
cache. Future implementations of the i860 XR
architecture may adhere to a write-through data
cache policy. Thus, they may cache pages having
the WT bit of the PTE set. The value of the CD bit
or the WT bit is output on the PTB pin for use by
external caches.

• Invalidating Instruction and Address-Transla- 
tion Caches. Storing to the dirbase register with 
the ITI bit set invalidates the contents of the
instruction and address-translation caches. This bit
should be set when modifying a page table, when 
modifying a page containing instructions, or when 
changing the DTB field of dirbase or the WP bit 
of the epsr. Note that in order to make the in- 
struction or address-translation caches consist- 
ent with the data cache, the data cache must be 
flushed before invalidating the other caches. 

NOTE: 

The mapping of the page containing the
currently executing instruction and the
next six instructions should not be different
in the new page tables when st.c dirbase
changes DTB or activates ITI. The
six instructions following the st.c should
be nops and should lie in the same page
as the st.c.

• Flushing the Data Cache. The data cache is
flushed by a software routine using the flush
instruction. The data cache must be flushed prior to
invalidating the instruction or address-translation
caches (as controlled by the ITI bit of dirbase) or
enabling or disabling address translation (via the
ATE bit). The data cache does not need flushing
if the program is modifying only the P, U, W, A, or
D bits of a PTE (as long as the Page Frame
Address is not changed and the PTE itself was not
in the data cache). The i860 XR CPU does not
check these protection bits on cache line
write-back. Thus, a trap handler can service a DAT for
D-bit-zero by setting D = 1 and then ITI = 1. In
the case of setting the P or A bits active, there is
no need to invalidate or flush any caches
because the processor does not load entries into
the TLB that have P = 0 or A = 0. The i860 XR
microprocessor searches only external memory
for Page Directories and Page Tables in the
translation process. The data cache is not
searched. Therefore, Page Tables and Directories
should be kept in non-cacheable memory, or
flushed from the cache by any code which
accesses them.

2.6 Instruction Set

Table 2.7 shows the complete set of instructions 
grouped by function within processing unit. Refer to 
Section 8 for an algorithmic definition of each in- 
struction. 

The architecture of the i860 XR microprocessor 
uses parallelism to increase the rate at which opera- 
tions may be introduced into the unit. Parallelism in 
the i860 XR microprocessor is not transparent; rath- 
er, programmers have complete control over paral- 
lelism and therefore can achieve maximum perform- 
ance for a variety of computational problems. 


2.6.1 PIPELINED AND SCALAR OPERATIONS 

One type of parallelism used within the floating-point 
unit is “pipelining”. The pipelined architecture treats 
each operation as a series of more primitive opera- 
tions (called “stages”) that can be executed in par- 
allel. Consider just the floating-point adder unit as an 
example. Let A represent the operation of the adder. 
Let the stages be represented by A1, A2, and A3.
The stages are designed such that A(i+1) for one
adder instruction can execute in parallel with A(i) for the
next adder instruction. Furthermore, each A(i) can be
executed in just one clock. The pipelining within the
multiplier and graphics units can be described simi- 
larly, except that the number of stages may be differ- 
ent. 

Figure 2.12 illustrates three-stage pipelining as
found in the floating-point adder (also in the
floating-point multiplier when single-precision input operands
are employed). The columns of the figure represent
the three stages of the pipeline. Each stage holds
intermediate results and also (when introduced into
the first stage by software) holds status information
pertaining to those results. The figure assumes that the
instruction stream consists of a series of consecutive
floating-point instructions, all of one type (i.e. all
adder instructions or all single-precision multiplier
instructions). The instructions are represented as i,
i+1, etc. The rows of the figure represent the states
of the unit at successive clock cycles. Each time a
pipelined operation is performed, the result of the
last stage of the pipeline is stored in the destination
register fdest, the pipeline is advanced one stage,
and the input operands fsrc1 and fsrc2 are
transferred to the first stage of the pipeline.


In the i860 XR microprocessor, the number of pipe- 
line stages ranges from one to three. A pipelined 
operation with a three-stage pipeline stores the re- 
sult of the third prior operation. A pipelined operation 
with a two-stage pipeline stores the result of the sec- 
ond prior operation. A pipelined operation with a 
one-stage pipeline stores the result of the prior oper- 
ation. 
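
The delayed-result behavior can be illustrated with a small software model. This is a simplification under stated assumptions: each value is computed when the operation enters the pipeline and is merely delayed, whereas the hardware computes it progressively across stages; the class name Pipeline and method issue are hypothetical.

```python
class Pipeline:
    """Toy model of an N-stage pipeline (N between 1 and 3 on the i860 XR)."""

    def __init__(self, stages: int):
        # Index 0 is the first stage; the last index is the final stage.
        self.stages = [None] * stages

    def issue(self, operation, *operands):
        """Start a new operation and return the result leaving the last
        stage, i.e. the value that would be stored in fdest."""
        retired = self.stages[-1]
        self.stages = [operation(*operands)] + self.stages[:-1]
        return retired

adder = Pipeline(3)
# Five pipelined adds; the first three return stale pipeline contents
# (None here), and the fourth returns the result of the first.
results = [adder.issue(lambda a, b: a + b, i, 10) for i in range(5)]
print(results)
```

With a three-stage pipeline, the result of an operation is returned by the third subsequent pipelined operation, which is why software must issue additional operations to drain results out of the pipeline.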


There are four floating-point pipelines: one for the 
multiplier, one for the adder, one for the graphics 
unit, and one for floating-point loads. The adder 
pipeline has three stages. The number of stages in 
the multiplier pipeline depends on the precision of 
the source operands in the pipeline. Single precision 
has three stages and double precision has two 
stages. The graphics unit has one stage for all
precisions. The load pipeline has three stages for all
precisions.

Changing the FZ (flush zero), RM (rounding mode), 
or RR (result register) bits of fsr while there are re- 
sults in either the multiplier or adder pipeline produc- 
es effects that are not defined. 



2.6.1.1 Scalar Mode

In addition to the pipelined execution mode, the i860
XR microprocessor also can execute floating-point
instructions in “scalar” mode. Most floating-point
instructions have both pipelined and scalar variants,
distinguished by a bit in the instruction encoding. In
scalar mode, the floating-point unit does not start a
new operation until the previous floating-point
operation is completed. The scalar operation passes
through all stages of its pipeline before a new
operation is introduced, and the result is stored
automatically. Scalar mode is used when the next operation
depends on results from the previous few
floating-point operations (or when the compiler or
programmer does not want to deal with pipelining).


2.6.1.2 Pipelining Status Information

Result status information in the fsr consists of the
AA, AI, AO, AU, and AE bits, in the case of the
adder, and the MA, MI, MO, and MU bits, in the case of
the multiplier. This information arrives at the fsr via
the pipeline in one of two ways:

Table 2.7. Instruction Set

Core Unit

  Mnemonic     Description

  Load and Store Instructions
  ld.x         Load integer
  st.x         Store integer
  fld.y        F-P load
  pfld.z       Pipelined F-P load
  fst.y        F-P store
  pst.d        Pixel store

  Register to Register Moves
  ixfr         Transfer integer to F-P register

  Integer Arithmetic Instructions
  addu         Add unsigned
  adds         Add signed
  subu         Subtract unsigned
  subs         Subtract signed

  Shift Instructions
  shl          Shift left
  shr          Shift right
  shra         Shift right arithmetic
  shrd         Shift right double

  Logical Instructions
  and          Logical AND
  andh         Logical AND high
  andnot       Logical AND NOT
  andnoth      Logical AND NOT high
  or           Logical OR
  orh          Logical OR high
  xor          Logical exclusive OR
  xorh         Logical exclusive OR high

  Control-Transfer Instructions
  trap         Software trap
  intovr       Software trap on integer overflow
  br           Branch direct
  bri          Branch indirect
  bc           Branch on CC
  bc.t         Branch on CC taken
  bnc          Branch on not CC
  bnc.t        Branch on not CC taken
  bte          Branch if equal
  btne         Branch if not equal
  bla          Branch on LCC and add
  call         Subroutine call
  calli        Indirect subroutine call

  System Control Instructions
  flush        Cache flush
  ld.c         Load from control register
  st.c         Store to control register
  lock         Begin interlocked sequence
  unlock       End interlocked sequence

Floating-Point Unit

  Mnemonic     Description

  Register to Register Moves
  fxfr         Transfer F-P to integer register

  F-P Multiplier Instructions
  fmul.p       F-P multiply
  pfmul.p      Pipelined F-P multiply
  pfmul3.dd    3-Stage pipelined F-P multiply
  fmlow.p      F-P multiply low
  frcp.p       F-P reciprocal
  frsqr.p      F-P reciprocal square root

  F-P Adder Instructions
  fadd.p       F-P add
  pfadd.p      Pipelined F-P add
  famov.r      F-P adder move
  pfamov.r     Pipelined F-P adder move
  fsub.p       F-P subtract
  pfsub.p      Pipelined F-P subtract
  pfgt.p       Pipelined F-P greater-than compare
  pfeq.p       Pipelined F-P equal compare
  fix.p        F-P to integer conversion
  pfix.p       Pipelined F-P to integer conversion
  ftrunc.p     F-P to integer truncation
  pftrunc.p    Pipelined F-P to integer truncation

  Dual-Operation Instructions
  pfam.p       Pipelined F-P add and multiply
  pfsm.p       Pipelined F-P subtract and multiply
  pfmam.p      Pipelined F-P multiply with add
  pfmsm.p      Pipelined F-P multiply with subtract

  Long Integer Instructions
  fisub.z      Long-integer subtract
  pfisub.z     Pipelined long-integer subtract
  fiadd.z      Long-integer add
  pfiadd.z     Pipelined long-integer add

  Graphics Instructions
  fzchks       16-bit Z-buffer check
  pfzchks      Pipelined 16-bit Z-buffer check
  fzchkl       32-bit Z-buffer check
  pfzchkl      Pipelined 32-bit Z-buffer check
  faddp        Add with pixel merge
  pfaddp       Pipelined add with pixel merge
  faddz        Add with Z merge
  pfaddz       Pipelined add with Z merge
  form         OR with MERGE register
  pform        Pipelined OR with MERGE register

Assembler Pseudo-Operations

  Mnemonic     Description

  mov          Integer register-register move
  fmov.r       F-P reg-reg move
  pfmov.r      Pipelined F-P reg-reg move
  nop          Core no-operation
  fnop         F-P no-operation
  pfle.p       Pipelined F-P less-than or equal

Figure 2.12. Pipelined Instruction Execution


1. It is calculated by the last stage of the pipeline.
This is the normal case.

2. It is propagated from the first stage of the
pipeline. This method is used when restoring the state
of the pipeline after a preemption. When a store
instruction updates the fsr and the value of the
U bit in the word being written into the fsr is set,
the store updates the result-status bits in the first
stage of both the adder and multiplier pipelines.
When software changes the result-status bits of
the first stage of a particular unit (multiplier or
adder), the updated result-status bits are propagated
one stage for each pipelined floating-point
operation for that unit. In this case, each stage of the
adder and multiplier pipelines holds its own copy
of the relevant bits of the fsr. When they reach
the last stage, they override the normal
result-status bits computed from the last-stage result.


At the next floating-point instruction (or at certain core instructions),
after the result reaches the last stage, the i860 XR microprocessor traps
if any of the status bits of the fsr indicate exceptions. Note that the
instruction that creates the exceptional condition is not the instruction
at which the trap occurs.

2.6.1.3 Precision in the Pipelines

In pipelined mode, when a floating-point operation is initiated, the
result of an earlier pipelined floating-point operation is returned. The
result precision of the current instruction applies to the operation being
initiated. The precision of the value stored in fdest is that which was
specified by the instruction that initiated that operation.
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[Diagram: two instruction streams, each shown as 64-bit rows (bits 63..0) of OP, d.FP-OP, FP-OP, and CORE-OP slots. The first is labeled ENTER DUAL-INSTRUCTION MODE, INITIATE EXIT FROM DUAL-INSTRUCTION MODE, and LEAVE DUAL-INSTRUCTION MODE; the second is labeled TEMPORARY DUAL-INSTRUCTION MODE.]
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Figure 2.13. Dual-Instruction Mode Transitions 


If fdest is the same as fsrc1 or fsrc2, the value being stored in fdest is
used as the input operand. In this case, the precision of fdest must be
the same as the source precision.

The multiplier pipeline has two stages when the source operand is
double-precision and three stages when the precision of the source operand
is single. This means that a pipelined multiplier operation stores the
result of the second previous multiplier operation for double-precision
inputs and the third previous for single-precision inputs (except when
changing precisions).
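
The relationship between pipeline depth and the result that a pipelined
multiply returns can be illustrated with a small software model. This is
only a sketch: the class name MultiplierPipeline and its pfmul method are
hypothetical, None stands in for the undefined contents of an unloaded
stage, and precision changes in mid-stream are not modeled.

```python
from collections import deque


class MultiplierPipeline:
    """Toy model of the multiplier pipeline (illustrative, not an Intel API)."""

    def __init__(self, stages):
        # 3 stages for single-precision sources, 2 for double-precision.
        # None marks a stage whose contents are undefined.
        self.stages = deque([None] * stages, maxlen=stages)

    def pfmul(self, a, b):
        """Start a*b and return the value leaving the last stage."""
        leaving = self.stages[-1]
        # With maxlen set, appendleft pushes every stage along by one
        # and discards the old last stage.
        self.stages.appendleft(a * b)
        return leaving


single = MultiplierPipeline(stages=3)
single_results = [single.pfmul(x, 2.0) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]

double = MultiplierPipeline(stages=2)
double_results = [double.pfmul(x, 2.0) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]
```

In the three-stage case the fourth operation returns the product started by
the first (the third previous operation); in the two-stage case the third
operation returns it (the second previous operation).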

2.6.1.4 Transition between Scalar and Pipelined Operations

When a scalar operation is executed, it passes through all stages of the
pipeline; therefore, any unstored results in the affected pipeline are
lost. To avoid losing information, the last pipelined operations before a
scalar operation should be dummy pipelined operations that unload unstored
results from the affected pipeline.

After a scalar operation, the values of all pipeline stages of the
affected unit (except the last) are undefined. No spurious
result-exception traps result when the undefined values are subsequently
stored by pipelined operations; however, the values should not be
referenced as source operands.

For best performance a scalar operation should not immediately precede a
pipelined operation whose fdest is nonzero.


2.6.2 DUAL-INSTRUCTION MODE 

Another form of parallelism results from the fact that the i860 XR
microprocessor can execute both a floating-point and a core instruction
simultaneously. Such parallel execution is called dual-instruction mode.
When executing in dual-instruction mode, the instruction sequence consists
of 64-bit aligned instructions with a floating-point instruction in the
lower 32 bits and a core instruction in the upper 32 bits. Table 2.7
identifies which instructions are executed by the core unit and which by
the floating-point unit.
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Programmers specify dual-instruction mode either 
by including in the mnemonic of a floating-point instruction a d. prefix
or by using the Assembler directives .dual ... .enddual. Both of the
specifications cause the D-bit of floating-point instructions to be set.
If the i860 XR microprocessor is executing in single-instruction mode and
encounters a floating-point instruction with the D-bit set, one more
32-bit instruction is executed before dual-mode execution begins. If the
i860 XR microprocessor is executing in dual-instruction mode and a
floating-point instruction is encountered with a clear D-bit, then one
more pair of instructions is executed before resuming single-instruction
mode. Figure 2.13 illustrates two variations of this sequence of events:
one for extended sequences of dual instructions and one for a single
instruction pair.

When a 64-bit dual-instruction pair sequentially fol- 
lows a delayed branch instruction in dual-instruction 
mode, both 32-bit instructions are executed. 
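
The mode transitions can be sketched as a small state machine. The model
below rests on an assumption drawn from the two rules above and from
Figure 2.13: the D-bit of a floating-point instruction takes effect one
fetch slot later, so it selects the mode of the slot after the next one.
The function name trace_modes and the tuple encoding of instructions are
hypothetical; traps and delayed branches are not modeled.

```python
def trace_modes(words):
    """Return the mode ('single' or 'dual') of each fetch slot.

    words: 32-bit instructions in address order, each a (kind, d_bit)
    tuple with kind 'fp' or 'core'. The d_bit of a core instruction
    is ignored.
    """
    mode, next_mode = "single", "single"
    i, trace = 0, []
    while i < len(words):
        # A dual-mode slot consumes a 64-bit pair: the floating-point
        # instruction in the lower word, the core instruction in the upper.
        width = 2 if mode == "dual" else 1
        group = words[i:i + width]
        i += width
        trace.append(mode)

        kind, d_bit = group[0]
        after_next = next_mode
        if kind == "fp":
            # Assumed rule: the D-bit picks the mode two slots ahead.
            after_next = "dual" if d_bit else "single"
        mode, next_mode = next_mode, after_next
    return trace


S, D = "single", "dual"

# Extended dual-instruction sequence.
extended = [("core", 0), ("fp", 1), ("core", 0),
            ("fp", 1), ("core", 0),
            ("fp", 0), ("core", 0),
            ("fp", 0), ("core", 0),
            ("core", 0), ("core", 0)]

# Temporary dual-instruction mode: a single instruction pair.
temporary = [("core", 0), ("fp", 1), ("fp", 0),
             ("fp", 0), ("core", 0),
             ("core", 0), ("core", 0)]
```

Under this model the extended stream runs three single slots, three dual
pairs, then returns to single-instruction mode, while the temporary stream
executes exactly one pair in dual-instruction mode.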

2.6.3 DUAL-OPERATION INSTRUCTIONS 

Special dual-operation floating-point instructions 
(add-and-multiply, subtract-and-multiply) use both 
the multiplier and adder units within the floating- 
point unit in parallel to efficiently execute such com- 
mon tasks as evaluating systems of linear equa- 
tions, performing the Fast Fourier Transform (FFT), 
and performing graphics transformations. 

The instructions pfam fsrc1, fsrc2, fdest (add and multiply), pfsm fsrc1,
fsrc2, fdest (subtract and multiply), pfmam fsrc1, fsrc2, fdest (multiply
and add), and pfmsm fsrc1, fsrc2, fdest (multiply and subtract) initiate
both an adder operation and a multiplier operation. Six operands are
required, but the instruction format specifies only three operands;
therefore, there are special provisions for specifying the operands. These
special provisions consist of:

• Three special registers (KR, KI, and T) that can store values from one
  dual-operation instruction and supply them as inputs to subsequent
  dual-operation instructions.

  1. The constant registers KR and KI can store the value of fsrc1 and
     subsequently supply that value to the multiplier pipeline in place
     of fsrc1.

  2. The transfer register T can store the last stage result of the
     multiplier pipeline and subsequently supply that value to the adder
     pipeline in place of fsrc1.

• A four-bit data-path control field in the opcode (DPC) that specifies
  the operands and loading of the special registers.

  1. Operand-1 of the multiplier can be KR, KI, or fsrc1.

  2. Operand-2 of the multiplier can be fsrc2 or the last stage result of
     the adder pipeline.

  3. Operand-1 of the adder can be fsrc1, the T-register, or the last
     stage result of the adder pipeline.

  4. Operand-2 of the adder can be fsrc2, the last stage result of the
     multiplier pipeline, or the last stage result of the adder pipeline.

Figure 2.14 shows all the possible data paths surrounding the adder and
multiplier. The DPC field in these instructions selects among the data
paths. Table 8.8 shows the various encodings of the DPC field. Refer to
the Dual-Operation Instructions section of the i860 Microprocessor
Programmer's Reference Manual for a pictorial description.


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Figure 2.14. Dual-Operation Data Paths 

Note that the mnemonics pfam.p, pfsm.p, pfmam.p, and pfmsm.p are never
used as such in the assembly language; these mnemonics are used here to
designate classes of related instructions. Each value of DPC has a unique
mnemonic associated with it.


2.7 Addressing Modes 

Data access is limited to load and store instructions. Memory addresses
are computed from two fields of load and store instructions: isrc1 and
isrc2.

1. isrc1 either contains the identifier of a 32-bit integer register or
   contains an immediate 16-bit address offset.

2. isrc2 always specifies a register.
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Table 2.8. Types of Traps 


                 Indication              Caused by
Type             PSR, EPSR   FSR         Condition                          Instruction
---------------  ----------  ----------  ---------------------------------  -------------------------------
Instruction      IT, OF                  Software traps                     trap, intovr
Fault            IL                      Missing unlock                     Any

Floating-Point   FT          SE          Floating-point source exception    Any M- or A-unit except fmlow
Fault                                    Floating-point result exception    Any M- or A-unit except fmlow,
                             AO, MO        overflow                         pfgt, and pfeq. Reported on
                             AU, MU        underflow                        any F-P instruction plus pst,
                             AI, MI        inexact result                   fst, and sometimes fld, pfld,
                                                                            ixfr

Instruction      IAT                     Address translation exception      Any
Access Fault                             during instruction fetch

Data Access      DAT*                    Load/store address translation     Any load/store
Fault                                    exception
                                         Misaligned operand address         Any load/store
                                         Operand address matches            Any load/store
                                         db register

Interrupt        IN                      External interrupt

Reset            No trap bits set        Hardware RESET signal


NOTES:

*These cases can be distinguished by examining the operand addresses.

The IL bit of the epsr must be checked by the trap handler to tell if the bus is currently in a locked sequence.


Because either isrc1 or isrc2 may be null (zero), a variety of useful
addressing modes result:

offset + register     Useful for accessing fields within a record, where
                      register points to the beginning of the record.
                      Useful for accessing items in a stack frame, where
                      register is r3, the register used for pointing to
                      the beginning of the stack frame.

register + register   Useful for two-dimensional arrays or for array
                      access within the stack frame.

register              Useful as the end result of any arbitrary address
                      calculation.

offset                Absolute address into the first or last 32K of the
                      logical address space.

In addition, the floating-point load and store instructions may select
autoincrement addressing. In this mode isrc2 is replaced by the sum of
isrc1 and isrc2 after performing the load or store. This mode makes
stepping through arrays more efficient, because it eliminates one
address-calculation instruction.
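
The address calculation and the autoincrement update can be sketched as
follows. This is an illustrative model rather than a description of the
hardware: the function names, the dictionary used as a register file, and
the dictionary used as memory are all hypothetical, and the 16-bit
immediate is assumed to be sign-extended, which is what allows a bare
offset to reach the first or last 32K of the address space.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset (assumed)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(src1, src2_value, src1_is_immediate):
    """Sum of isrc1 (register value or immediate offset) and isrc2."""
    if src1_is_immediate:
        src1 = sign_extend16(src1)
    return (src1 + src2_value) & MASK32


def fp_load_autoincrement(regs, memory, offset, isrc2):
    """Load from offset + regs[isrc2], then write the sum back to isrc2."""
    address = effective_address(offset, regs[isrc2], True)
    value = memory[address]
    regs[isrc2] = address  # autoincrement: isrc2 <- isrc1 + isrc2
    return value


# Stepping through an array of 8-byte elements without a separate
# address-calculation instruction.
regs = {4: 0x1000}
memory = {0x1008: 1.5, 0x1010: 2.5}
first = fp_load_autoincrement(regs, memory, 8, 4)
second = fp_load_autoincrement(regs, memory, 8, 4)
```

After the two loads the register holds 0x1010, the address of the element
most recently accessed.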


2.8 Traps and Interrupts

Traps are caused by exceptional conditions detected in programs or by
external interrupts. Traps cause interruption of normal program flow to
execute a special program known as a trap handler. Traps are divided into
the types shown in Table 2.8. Interrupts and traps start execution in
single-instruction mode at virtual address 0xFFFFFF00 in supervisor level
(U = 0).

2.8.1 TRAP HANDLER INVOCATION 

This section applies to traps other than reset. When a trap occurs,
execution of the current instruction is aborted. The instruction is
restartable. The processor takes the following steps while transferring
control to the trap handler:

1. Copies U (user mode) of the psr into PU (previous U).

2. Copies IM (interrupt mode) into PIM (previous IM).

3. Sets U to zero (supervisor mode).

4. Sets IM to zero (interrupts disabled).

5. If the processor is in dual-instruction mode, it sets DIM; otherwise it
   clears DIM.

6. If the processor is in single-instruction mode and the next instruction
   will be executed in dual-instruction mode or if the processor is in
   dual-instruction mode and the next instruction will be executed in
   single-instruction mode, DS is set; otherwise, it is cleared.

7. The appropriate trap type bits in psr are set (IT, IN, IAT, DAT, FT).
   Several bits may be set if the corresponding trap conditions occur
   simultaneously.
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8. An address is placed in the fault instruction regis- 
ter (fir) to help locate the trapped instruction. In 
single-instruction mode, the address in fir is the 
address of the trapped instruction itself. In dual-in- 
struction mode, the address in fir is that of the 
floating-point half of the dual instruction. If an in- 
struction or data access fault occurred, the asso- 
ciated core instruction is the high-order half of the 
dual instruction (fir + 4). In dual-instruction 
mode, when a data access fault occurs in the ab- 
sence of other trap conditions, the floating-point 
half of the dual instruction will already have been 
executed. 

The processor begins executing the trap handler by transferring execution
to virtual address 0xFFFFFF00. The trap handler begins execution in
single-instruction mode. The trap handler must examine the trap-type bits
in psr (IT, IN, IAT, DAT, FT) to determine the cause or causes of the
trap.
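
The psr updates made during trap entry (steps 1 through 7) can be
summarized in a short model. It is a sketch only: the psr is represented
as a dictionary of named flags rather than as a bit field, and the
function name enter_trap and its parameters are hypothetical.

```python
TRAP_BITS = ("IT", "IN", "IAT", "DAT", "FT")


def enter_trap(psr, in_dual_mode, next_slot_changes_mode, causes):
    """Return a copy of psr as it would look on entry to the trap handler."""
    new = dict(psr)
    new["PU"] = psr["U"]    # step 1: save the user-mode bit
    new["PIM"] = psr["IM"]  # step 2: save the interrupt-mode bit
    new["U"] = 0            # step 3: supervisor mode
    new["IM"] = 0           # step 4: interrupts disabled
    new["DIM"] = 1 if in_dual_mode else 0            # step 5
    new["DS"] = 1 if next_slot_changes_mode else 0   # step 6
    for bit in TRAP_BITS:   # step 7: several causes may be reported at once
        if bit in causes:
            new[bit] = 1
    return new


user_psr = {"U": 1, "PU": 0, "IM": 1, "PIM": 0, "DIM": 0, "DS": 0,
            "IT": 0, "IN": 0, "IAT": 0, "DAT": 0, "FT": 0}
handler_psr = enter_trap(user_psr, in_dual_mode=True,
                         next_slot_changes_mode=False,
                         causes={"DAT", "FT"})
```

A handler working from such a state would test each of the five trap-type
flags in turn, since more than one may be set.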


2.8.2 INSTRUCTION FAULT

This fault is caused by any of the following conditions. In all cases the
processor sets the IT bit before entering the trap handler.

1. By the trap instruction. When trap is executed in dual-instruction
   mode, the floating-point companion of the trap instruction is not
   executed before the trap is taken.

2. By the intovr instruction. The trap occurs only if OF in epsr is set
   when intovr is executed. The trap handler should clear OF before
   returning. When intovr causes a trap in dual-instruction mode, the
   floating-point companion of the intovr instruction is completely
   executed before the trap is taken.

3. By violation of lock/unlock protocol, explained below. (Note that trap
   and intovr should not be used within a locked sequence; otherwise, it
   would be difficult to distinguish between this and the prior cases.)

The lock protocol requires the following sequence of activities:

1. lock

2. Any load or store instruction that misses the cache

3. unlock

4. Any load or store instruction (regardless of whether it misses the
   cache)

There may be other instructions between any of these steps. The bus is
locked after step 2, and remains locked until step 4. Step 4 must follow
step 1 by 30 instructions or less; otherwise the instruction trap occurs.
In case of a trap, IL is also set. If the load or store instruction in
step 2 hits the cache, the sequence is legal, but the bus is not locked.


2.8.3 FLOATING-POINT FAULT

The floating-point fault is reported on floating-point instructions, pst,
fst, and sometimes fld, pfld, ixfr. The floating-point faults of the i860
XR microprocessor support the floating-point exceptions defined by the
IEEE standard as well as some other useful classes of exceptions. The i860
XR microprocessor divides these into two classes: source exceptions and
result exceptions. The numerics library supplied by Intel provides the
IEEE standard default handling for all these exceptions.

2.8.3.1 Source Exception Faults

When used as inputs to the multiplier or adder, all exceptional operands,
including infinities, denormalized numbers and NaNs, cause a
floating-point fault and set SE in the fsr. Source exceptions are reported
on the instruction that initiates the operation. For pipelined operations,
the pipeline is not advanced.

The SE value is undefined for faults on fld, pfld, fst, pst, and ixfr
instructions when in single-instruction mode or when in dual-instruction
mode and the companion instruction is not a multiplier or adder operation.

2.8.3.2 Result Exception Faults 

The class of result exceptions includes any of the following conditions:

• Overflow. The absolute value of the rounded true result would exceed the
  largest positive finite number in the destination format.

• Underflow (when FZ is clear). The absolute value of the rounded true
  result would be smaller than the smallest positive finite number in the
  destination format.

• Inexact result (when TI is set). The result is not exactly representable
  in the destination format. For example, the fraction 1/3 cannot be
  precisely represented in binary form. This exception occurs frequently
  and indicates that some (generally acceptable) accuracy has been lost.
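
The inexact condition can be demonstrated on any machine with IEEE
single-precision arithmetic. The sketch below is not i860 code: it uses
Python's struct module to round a value to single precision and compares
the result with the exact rational value. The function name
is_exact_single is hypothetical, and the check is meant only for values
well inside the single-precision range.

```python
import struct
from fractions import Fraction


def is_exact_single(value):
    """True if the rational value is exactly representable in IEEE single."""
    # Round to single precision by packing and unpacking a 32-bit float.
    rounded = struct.unpack("<f", struct.pack("<f", float(value)))[0]
    # Fraction(float) converts the binary value exactly, with no rounding.
    return Fraction(rounded) == value


one_third_exact = is_exact_single(Fraction(1, 3))
three_eighths_exact = is_exact_single(Fraction(3, 8))
```

A value such as 3/8 has a finite binary expansion and is exact, whereas
1/3 does not and would raise the inexact condition when TI is set.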

The point at which a result exception is reported depends upon whether
pipelined operations are being used:

• Scalar (nonpipelined) operations. Result exceptions are reported on the
  next floating-point, fst.x, or pst.x (and sometimes fld, pfld, ixfr)
  instruction after the scalar operation. When a trap occurs, the last
  stage of the affected unit contains the result of the scalar operation.

• Pipelined operations. Result exceptions are reported when the result is
  in the last stage and the next floating-point, fst.x or pst.x (and
  sometimes fld, pfld, ixfr) instruction is executed. When a trap occurs,
  the pipeline is not advanced, and the last stage results (that caused
  the trap) remain unchanged.
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When no trap occurs (either because FTE is clear or 
because no exception occurred), the pipeline is ad- 
vanced normally by the new floating-point operation. 

The result-status bits of the affected unit are undefined until the point
that result exceptions are reported. At this point, the last stage
result-status bits (bits 29..22 and 16..9 of the fsr) reflect the values
in the last stages of both the adder and multiplier. For example, if the
last stage result in the multiplier has overflowed and a pipelined
floating-point pfadd is started, a trap occurs and MO is set.

For scalar operations, the RR bits of fsr specify the register in which
the result was stored. RR is updated when the scalar instruction is
initiated. The trap, however, occurs on a subsequent instruction.
Programmers must prevent intervening stores to fsr from modifying the RR
bits. Prevention may take one of the following forms:

• Before any store to fsr when a result exception may be pending, execute
  a dummy floating-point operation to trigger the result-exception trap.

• Always read from fsr before storing to it, and mask updates so that the
  RR bits are not changed.
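
The second form, read-modify-write with a mask, can be sketched as below.
The position of the RR field (five bits at 21..17) is an assumption made
for this illustration and should be checked against the fsr layout; the
function name is hypothetical.

```python
RR_SHIFT = 17                 # assumed position of the RR field in fsr
RR_MASK = 0x1F << RR_SHIFT    # assumed width: five bits (register number)


def fsr_value_preserving_rr(current_fsr, desired_fsr):
    """Combine the desired fsr settings with the RR bits already in fsr."""
    keep = current_fsr & RR_MASK
    update = desired_fsr & ~RR_MASK & 0xFFFFFFFF
    return update | keep


current = (0b10110 << RR_SHIFT) | 0x01   # RR names a register; one flag set
desired = (0b00001 << RR_SHIFT) | 0x20   # caller's RR field is ignored
merged = fsr_value_preserving_rr(current, desired)
```

Only the bits outside the RR field are taken from the new value, so a
pending result exception still identifies the correct destination
register.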

For pipelined operations, RR is cleared and the result is in the last
stage of the pipeline of the appropriate unit. The trap handler must flush
the pipeline, saving the results and the status bits.

In either pipelined or scalar mode, the trap handler must then compute the
trapping result. In either case, the result has the same fraction as the
true result and has an exponent which is the low-order bits of the true
result. The trap handler can inspect the result, compute the result
appropriate for that instruction (a NaN or an infinity, for example), and
store the correct result. The result is either stored in the register
specified by RR (if nonzero) or (if RR = 0) the trap handler must reload
the pipeline with the saved results and status bits.

Result exceptions may be reported for both the adder and multiplier units
at the same time. In this case, the trap handler should fix up the last
stage of both pipelines.

2.8.4 INSTRUCTION ACCESS FAULT 

This trap occurs during address translation for instruction fetches in any
of these cases:

• The address fetched is in a page whose P (present) bit in the page table
  is clear (not present).

• The address fetched is in a supervisor mode page, but the processor is
  in user mode.

• The address fetched is in a page whose PTE has A = 0, and the access
  occurs during a locked sequence (i.e., between lock and unlock).

Note that several instructions are fetched at one time, either due to
instruction prefetching or to instruction caching. Therefore, a trap
handler can change from supervisor to user mode and continue to execute
instructions fetched from a supervisor page. An instruction access trap
occurs only when the next group of instructions is fetched from a
supervisor page (up to eight instructions later). If, in the meantime, the
handler branches to a user page, no instruction access trap occurs. No
protection violation results, because the processor does not permit data
accesses to supervisor pages while running in user mode.

2.8.5 DATA ACCESS FAULT 

This trap results from an abnormal condition detected during data operand
fetch or store. Such an exception can be due only to one of the following
causes:

• An attempt is being made to write to a page whose D (Dirty) bit is
  clear.

• A memory operand is misaligned (is not located at an address that is a
  multiple of the length of the data).

• The address stored in the db register is equal to one of the addresses
  spanned by the operand.

• The operand is in a not-present page.

• An attempt is being made from user level to write to a read-only page or
  to access a supervisor-level page.

• The operand was in a page whose PTE had A = 0, and the access occurred
  during a locked sequence (i.e., between lock and unlock).

• Write protection (determined by epsr bit WP = 1) is violated in
  supervisor mode.
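
Two of these causes, misalignment and a match with the db register, depend
only on the operand address and length, which is why a trap handler can
tell them apart by examining the operand address. A minimal sketch
follows; the function names are hypothetical, and enabling of the
breakpoint through the BR and BW bits of the psr is not modeled.

```python
def is_misaligned(address, length):
    """True if address is not a multiple of the operand length in bytes."""
    return address % length != 0


def matches_breakpoint(db, address, length):
    """True if db equals one of the addresses spanned by the operand."""
    return address <= db < address + length


aligned_double = is_misaligned(0x2008, 8)
odd_word = is_misaligned(0x2006, 4)
inside = matches_breakpoint(0x2003, 0x2000, 4)
just_past = matches_breakpoint(0x2004, 0x2000, 4)
```

A 4-byte operand at 0x2000 spans addresses 0x2000 through 0x2003, so a db
value of 0x2003 matches and 0x2004 does not.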

2.8.6 INTERRUPT TRAP 

An interrupt is an event that is signaled from an external source. If the
processor is executing with interrupts enabled (IM set in the psr), the
processor sets the interrupt bit IN in the psr, and generates an interrupt
trap. Vectored interrupts are implemented by interrupt controllers and
software.

2.8.7 RESET TRAP 

When the i860 XR microprocessor is reset, execution begins in
single-instruction mode at physical address 0xFFFFFF00. This is the same
address as for other traps. The reset trap can be distinguished from other
traps by the fact that no trap bits are set. The instruction cache is
flushed. The bits DPS, BL, and ATE in dirbase are cleared. CS8 is
initialized by the value at the INT pin at the end of reset. The read-only
fields of the epsr are set to identify the processor, while the IL, WP,
and PBM bits are cleared. The
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bits U, IM, BR, and BW in psr are cleared, as are the 
trap bits FT, DAT, IAT, IN, and IT. All other bits of
psr and all other register contents are undefined.


Refer to Table 2.9 for a summary of these initial set- 
tings. 

Table 2.9. Register and Cache Values after Reset 


Registers                   Initial Value
--------------------------  ---------------------------------------------
Integer Registers           Undefined
Floating-Point Registers    Undefined
psr                         U, IM, BR, BW, FT, DAT, IAT, IN, IT = 0;
                            others are undefined
epsr                        IL, WP, PBM, BE = 0; Processor Type, Stepping
                            Number, DCS are read only; others are
                            undefined
db                          Undefined
dirbase                     DPS, BL, ATE = 0; others are undefined
fir                         Undefined
fsr                         Undefined
KR, KI, T, MERGE            Undefined

Caches                      Initial Value
--------------------------  ---------------------------------------------
Instruction Cache           Flushed
Data Cache                  Undefined
TLB                         Flushed


The software must ensure that the data cache is 
flushed and control registers are properly initialized 
before performing operations that depend on the 
values of the cache or registers. The data cache has 
no “validity” bits, so memory accesses before the 
flush may result in false data cache hits. 

Reset code must initialize the floating-point pipeline 
state to zero with floating-point traps disabled to en- 
sure that no spurious floating-point traps are gener- 
ated. 

After a RESET the i860 XR microprocessor starts execution at supervisor
level (U = 0). Before branching to the first user-level instruction, the
RESET trap handler or subsequent initialization code has to set PU and a
trap bit so that an indirect branch instruction will copy PU to U, thereby
changing to user level.


2.9 Debugging 

The i860 XR microprocessor supports debugging with both data and
instruction breakpoints. The features of the i860 XR architecture that
support debugging include:

• db (data breakpoint register), which permits specification of a data
  address that the i860 XR microprocessor will monitor.

• BR (break read) and BW (break write) bits of the psr, which enable
  trapping of either reads or writes (respectively) to the address in db.

• DAT (data access trap) bit of the psr, which allows the trap handler to
  determine when a data breakpoint was the cause of the trap.

• trap instruction, which can be used to set breakpoints in code. Any
  number of code breakpoints can be set. The values of the isrc1 and isrc2
  fields help identify which breakpoint has occurred.

• IT (instruction trap) bit of the psr, which allows the trap handler to
  determine when a trap instruction was the cause of the trap.


3.0 HARDWARE INTERFACE 

In the following description of hardware interface, the # symbol at the
end of a signal name indicates that the active or asserted state occurs
when the signal is at a low voltage. When no # is present after the signal
name, the signal is asserted when at the high voltage level.



3.1 Signal Description 

Table 3.1 identifies functional groupings of the pins, lists every pin by
its identifier, gives a brief description of its function, and lists some
of its characteristics. All output pins are tristate, except HLDA and
BREQ. All inputs are synchronous, except HOLD and INT.


3.1.1 CLOCK (CLK) 

The CLK input determines execution rate and timing of the i860 XR
microprocessor. Timing of other signals is specified relative to the
rising edge of this signal. The i860 XR microprocessor can utilize a clock
rate of 25 MHz, 33.3 MHz or 40 MHz. The internal operating frequency is
the same as the external clock.


3.1.2 SYSTEM RESET (RESET) 

Asserting RESET for at least 16 CLK periods causes 
initialization of the i860 XR microprocessor. Refer to 
section 3.2 “Initialization” for more details related to 
RESET. 


3.1.3 BUS HOLD (HOLD) AND BUS HOLD 
ACKNOWLEDGE (HLDA) 

These pins are used for i860 XR microprocessor bus arbitration. At some
clock after the HOLD signal is asserted, the i860 XR microprocessor
releases control of the local bus and puts all bus interface outputs
(except BREQ and HLDA) into a floating state, then asserts HLDA, all
during the same clock period. It maintains this state until HOLD is
deasserted. Instruction execution stops only if required instructions or
data cannot be read from the on-chip instruction and data caches.

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Table 3.1. Pin Summary

Pin                                        Active      Input/
Name         Function                      State       Output
-----------  ----------------------------  ----------  ------

Execution Control Pins
CLK          CLOCK                                     I
RESET        System reset                  High        I
HOLD         Bus hold                      High        I
HLDA         Bus hold acknowledge          High        O
BREQ         Bus request                   High        O
INT/CS8      Interrupt, code-size          High        I

Bus Interface Pins
A31-A3       Address bus                   High        O
BE7#-BE0#    Byte Enables                  Low         O
D63-D0       Data bus                      High        I/O
LOCK#        Bus lock                      Low         O
W/R#         Write/Read bus cycle          High/Low    O
NENE#        NExt NEar                     Low         O
NA#          Next Address request          Low         I
READY#       Transfer Acknowledge          Low         I
ADS#         ADdress Status                Low         O

Cache Interface Pins
KEN#         Cache ENable                  Low         I
PTB          Page Table Bit                High        O

Testability Pins
SHI          Boundary Scan Shift Input     High        I
BSCN         Boundary Scan Enable          High        I
SCAN         Shift Scan Path                           I

Intel-Reserved Configuration Pins
CC1-CC0      Configuration                 High        I

Power and Ground Pins
Vcc          System power
Vss          System ground

A # after a pin name indicates that the signal is active when at the low voltage level.

The time required to acknowledge a hold request is one clock plus the
number of clocks needed to finish any outstanding bus cycles. HOLD is
recognized even while RESET or LOCK# is asserted.

When leaving a bus hold, the i860 XR microprocessor deactivates HLDA and,
in the same clock period, initiates a pending bus cycle, if any.

HOLD is an asynchronous input.


3.1.4 BUS REQUEST (BREQ) 

This signal is asserted when the i860 XR microprocessor has a pending
memory request, even when HLDA is asserted. This allows an external bus
arbiter to implement an "on demand only" policy for granting the bus to
the i860 XR microprocessor. BREQ is asserted the clock after the i860 XR
microprocessor realizes an internal request for the bus. In normal
operation, BREQ goes low the clock after ADS# goes low for the final
pending bus cycle. (Refer to Figure 4.10 for timing information.) During
data or instruction cache fills, however, BREQ may be deasserted for one
or more clocks, due to cache and TLB logic.

3.1.5 INTERRUPT/CODE-SIZE (INT/CS8)

This input allows interruption of the current instruction stream. If
interrupts are enabled (IM set in psr) when INT is asserted, the i860 XR
microprocessor fetches the next instruction from address 0xFFFFFF00. To
assure that an interrupt is recognized, INT should remain asserted until
the software acknowledges the interrupt (by writing, for example, to a
memory-mapped port of an interrupt controller). When the bus is not
locked, the maximum time between the assertion of INT and the execution of
the first instruction of the trap handler is ten clocks, plus the time for
four sets of four pipelined read cycles and two sets of four pipelined
writes (instruction- and data-cache misses and write-back cycles to update
memory), plus the time for twenty nonpipelined read cycles (six TLB
misses, with eight refetches when the A-bit is zero), plus the time for
eight nonpipelined writes (updates to the A-bit).
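
The worst-case bound above is a sum of a fixed term and four bus-dependent
terms, so it can be evaluated once the bus cycle times of a particular
memory system are known. The following sketch does that arithmetic; the
function name, parameter names, and the sample clock counts are
assumptions chosen for illustration and are not timings from this data
sheet.

```python
def worst_case_interrupt_latency(pipelined_read_set, pipelined_write_set,
                                 nonpipelined_read, nonpipelined_write):
    """Maximum clocks from INT assertion to the first handler instruction.

    Each argument is a clock count for the memory system in use:
    one set of four pipelined reads, one set of four pipelined writes,
    one nonpipelined read, and one nonpipelined write.
    """
    return (10                          # fixed overhead
            + 4 * pipelined_read_set    # cache-miss line fills
            + 2 * pipelined_write_set   # write-back cycles
            + 20 * nonpipelined_read    # TLB misses and refetches
            + 8 * nonpipelined_write)   # updates to the A-bit


# Hypothetical memory system: 8 clocks per set of four pipelined
# cycles, 3 clocks per nonpipelined cycle.
clocks = worst_case_interrupt_latency(8, 8, 3, 3)
```

With these sample values the bound is 142 clocks; slower memory scales the
four bus-dependent terms while the ten-clock term stays fixed.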

If the bus is locked from a lock instruction, the INT pin is ignored and
the INT bit of epsr is always zero. The lock instruction can only assert
LOCK# for 30-33 instructions before trapping.

If INT is asserted during the clock before the falling edge of RESET, the
eight-bit code-size mode is selected. For more about this mode, refer to
section 3.2 "Initialization".

INT is an asynchronous input.


3.1.6 ADDRESS PINS (A31-A3) AND BYTE ENABLES (BE7#-BE0#)

The 29-bit address bus (A31-A3) identifies addresses to a 64-bit location.
Separate byte-enable signals (BE7#-BE0#) identify which bytes should be
accessed within the 64-bit location. In all noncacheable read cycles
(KEN# deasserted), the byte enables match the length and address of the
requested data. Cacheable read cycles (KEN# asserted), however, result in
four 64-bit memory cycles to fill an entire 32-byte cache line. The BEn#
pins activated are those that represent the operand of the load
instruction that caused the line fill, and these same BEn# pins remain
activated for all four cycles of the line fill. All 64 bits must be
returned for each cycle without regard for the BEn# signals. In all write
cycles (noncacheable writes as well as cache line write-backs) the BEn#
signals indicate the bytes that must be written.

Instruction fetches (W/R# is low) are distinguished from data accesses by
the unique combinations of BE7#-BE0# defined in Table 3.2. For an
eight-bit code fetch in eight-bit code-size (CS8) mode, BE2#-BE0# are
redefined to be A2-A0 of the address. In this case BE7#-BE3# form the code
shown in Table 3.2 that identifies an instruction fetch. The A2 in the
table does not represent a physical pin, just a conceptual internal
address line value. The "x" under A2 for CS8 mode means "not applicable",
or "don't care". All other combinations of byte enables indicate data
accesses.

The address and byte-enable pins are driven until either NA# or READY# is
asserted.


3.1.7 DATA PINS (D63-D0)

The bus interface has 64 bidirectional data pins (D63-D0) to transfer data
in eight- to 64-bit quantities. Pins D7-D0 transfer the least significant
byte; pins D63-D56 transfer the most significant byte.

In read bus cycles, all 64 bits of the data bus are latched, even in
CS8-mode instruction fetches when only the low-order eight bits are used.

In write bus cycles, the point at which data is driven onto the bus
depends on the type of the preceding cycle. If there was no preceding
cycle (i.e., the bus was idle), data is driven with the address. If the
preceding cycle was a write, data is driven as soon as READY# is returned
from the previous cycle. If the preceding cycle was a read, data is driven
one clock after READY# is returned from the previous cycle, thereby
allowing time for the bus to be turned around. Data continues to be driven
until READY# for the current cycle is returned.
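
For a noncacheable data access, the values on the address and byte-enable
pins follow directly from the operand address and length. The sketch below
illustrates that mapping under two assumptions: that BEn# corresponds to
byte n of the 64-bit location (byte 0 being the least significant byte on
D7-D0), and that the operand is naturally aligned. The function names are
hypothetical.

```python
def bus_cycle_pins(address, length):
    """Return (value on A31-A3, levels on BE7#-BE0#) for a data access.

    The byte-enable result is an 8-bit number whose bit n is the level
    on BEn#; because the pins are active low, 0 means "byte accessed".
    """
    offset = address & 0x7
    if length not in (1, 2, 4, 8) or offset % length:
        raise ValueError("operand must be naturally aligned")
    accessed = ((1 << length) - 1) << offset  # bit n = 1: byte n accessed
    return address >> 3, ~accessed & 0xFF     # invert for active-low pins


def data_lane(byte_index):
    """Data pins (high, low) that carry byte n of the 64-bit location."""
    return 8 * byte_index + 7, 8 * byte_index


upper_word = bus_cycle_pins(0x1004, 4)
full_quad = bus_cycle_pins(0x1000, 8)
one_byte = bus_cycle_pins(0x1003, 1)
```

A 32-bit access at 0x1004 therefore drives BE7#-BE4# low and leaves
BE3#-BE0# high, and its data travels on D63-D32.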


3.1.8 BUS LOCK (LOCK#) 

This signal is used to provide atomic (indivisible)
read-modify-write sequences in multiprocessor
systems. A multiprocessor bus arbiter must permit only
one processor a locked access to the address which
is on the bus when LOCK# first activates. The system
must maintain the lock of that location until
LOCK# deactivates.

The i860 XR microprocessor coordinates the external
LOCK# signal with the software-controlled BL bit of
the dirbase register. Programmers do not have to be
concerned about the fact that bus activity is not
always synchronous with instruction execution.
LOCK# is asserted with ADS# for the address operand
of the first load or store instruction executed after
the BL bit is set by the lock instruction. Pending bus
cycles are locked according to the value of the BL bit
when the instruction was executed. Even if the BL bit
is changed between the time that an instruction
generates an internal bus request and the time that
the cycle appears on the bus, the i860 XR
microprocessor still asserts LOCK# for that bus cycle.

If ADS# is active when LOCK# deactivates, then that
request should complete before the hardware
relinquishes the lock. If ADS# is not active, the
locking of the location can end immediately when
LOCK# deactivates. Of course, the simplest arbitration
hardware can just lock the entire bus against all
other accesses during LOCK# assertion through
READY# of the cycle in which LOCK# goes inactive.
Table 3.2. Identifying Instruction Fetches

Code Fetch        A2  BE7# BE6# BE5# BE4# BE3# BE2# BE1# BE0#
Normal (Non-CS8)  0   1    1    1    1    1    0    1    0
Normal (Non-CS8)  1   1    0    1    0    1    1    1    1
CS8 Mode          x   1    0    1    0    1    Low-order address bits
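As an illustration only, the patterns of Table 3.2 can be checked by external logic along the following lines. This is a minimal sketch, not part of the device specification: the function names are hypothetical, pin levels are written as a string of '0'/'1' characters for BE7# down to BE0#, and it is assumed that in CS8 mode the levels on BE2#-BE0# equal address bits A2-A0 directly.

```python
# Byte-enable patterns from Table 3.2, written as BE7#..BE0# pin levels
# (1 = HIGH, 0 = LOW). All names below are illustrative, not Intel's.
NORMAL_FETCH_A2_0 = "11111010"
NORMAL_FETCH_A2_1 = "10101111"
CS8_FETCH_PREFIX = "10101"  # BE7#..BE3#; BE2#..BE0# carry A2..A0


def classify_read(be_pins: str, cs8_mode: bool = False) -> str:
    """Classify a read cycle (W/R# low) from its BE7#..BE0# levels."""
    if len(be_pins) != 8 or set(be_pins) - {"0", "1"}:
        raise ValueError("expected 8 characters of '0'/'1' for BE7#..BE0#")
    if cs8_mode:
        is_fetch = be_pins.startswith(CS8_FETCH_PREFIX)
    else:
        is_fetch = be_pins in (NORMAL_FETCH_A2_0, NORMAL_FETCH_A2_1)
    return "code fetch" if is_fetch else "data access"


def cs8_byte_address(a31_a3: int, be_pins: str) -> int:
    """Form the byte address of a CS8 code fetch.

    a31_a3 is the 29-bit value on pins A31-A3. Assumption: the levels
    on BE2#..BE0# are address bits A2..A0 without inversion.
    """
    low3 = int(be_pins[5:], 2)
    return (a31_a3 << 3) | low3
```

For example, a CS8 fetch with 0x1FFFFFE0 on A31-A3 and "10101011" on the byte enables would address byte 0xFFFFFF03 under these assumptions.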


When the BL bit is deasserted with the unlock
instruction, LOCK# is deasserted with the next load
or store but after any pending bus cycles. Between
locked sequences, at least one cycle of no LOCK#
is guaranteed by the behavior of the unlock
instruction. LOCK# deassertion may occur independently
of ADS# for the case of a trap or a cache hit after
unlock.

The i860 XR microprocessor also asserts LOCK#
during TLB miss processing for updates of the
accessed bit in page-table entries. The maximum time
that LOCK# can be asserted in this case is five
clocks plus the time required to perform a
read-modify-write sequence. Instruction fetches do not
alter the LOCK# pin.

Between lock and unlock instructions, the INT pin is
ignored and the INT bit of epsr is zero when read by
ld.c epsr. The time that interrupts are disabled is
limited by the lock protocol outlined in Section 2.8.2.

3.1.9 WRITE/READ BUS CYCLE (W/R#) 

This pin specifies whether a bus cycle is a read
(LOW) or write (HIGH) cycle. It is driven until either
NA# or READY# is asserted.

3.1.10 NEXT NEAR (NENE#) 

This signal allows higher-speed reads and writes in
the case of consecutive reads and writes that access
static column or page-mode DRAMs. The i860 XR
microprocessor asserts NENE# when the current
address is in the same DRAM page as the previous bus
cycle. The i860 XR microprocessor determines the
DRAM page size by inspecting the DPS field in the
dirbase register. The page size can range from 2^9 to
2^16 64-bit words, supporting DRAM sizes from
256K x 1, 256K x 4, and up. NENE# is never asserted
on the next bus cycle after HLDA is deasserted.
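The same-page comparison behind NENE# can be sketched as follows. This is a hedged illustration: the text above gives only the range of page sizes (2^9 to 2^16 64-bit words), so the mapping "page size = 2^(9 + DPS) words for DPS = 0..7" is an assumption, and the function names are hypothetical. The rule that NENE# is never asserted on the cycle following HLDA deassertion is not modeled.

```python
BYTES_PER_WORD = 8  # one 64-bit bus word


def dram_page_words(dps: int) -> int:
    """Page size in 64-bit words (assumed encoding: 2**(9 + DPS))."""
    if not 0 <= dps <= 7:
        raise ValueError("DPS is assumed to be a 3-bit field (0..7)")
    return 1 << (9 + dps)


def same_dram_page(prev_addr: int, cur_addr: int, dps: int) -> bool:
    """True when both byte addresses fall in the same DRAM page."""
    page_bytes = dram_page_words(dps) * BYTES_PER_WORD
    return prev_addr // page_bytes == cur_addr // page_bytes
```

With DPS = 0 under this assumption, a page holds 512 words (4 Kbytes), so addresses 0x1000 and 0x1FF8 share a page while 0x0FF8 and 0x1000 do not.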

3.1.11 NEXT ADDRESS REQUEST (NA#) 

NA# makes address pipelining possible. The system
asserts NA# for at least one clock to indicate that it
is ready to accept the next address from the i860 XR
microprocessor. NA# may be asserted before the
current cycle ends. (If the system does not implement
pipelining, NA# does not have to be activated.) The
i860 XR microprocessor samples NA# every clock,
starting one clock after the prior activation of ADS#.
When NA# is active, the i860 XR microprocessor is free
to drive address and bus-cycle definition for the next
pending bus cycle. The i860 XR microprocessor
remembers that NA# was asserted when no internal
request is pending; therefore, NA# can be deactivated
after the next rising edge of the CLK signal. Up to
three bus cycles can be outstanding simultaneously.

3.1.12 TRANSFER ACKNOWLEDGE (READY#) 

The system must assert the READY# signal during
read cycles when valid data is on the data pins and
during write cycles when the system has accepted
data from the data pins. READY# must be asserted
for at least one clock. Sampling of READY# begins
in the clock after an ADS# or in the second clock
after a prior READY#.

3.1.13 ADDRESS STATUS (ADS#) 

The i860 XR microprocessor asserts ADS# during 
the first clock of each bus cycle to identify the clock 
period during which it begins to assert outputs on 
the address bus. This signal is held active for one 
clock. 

3.1.14 CACHE ENABLE (KEN#) 

The i860 XR microprocessor samples KEN# to
determine whether the data being read for the current
cache-miss cycle is to be cached. This pin is
internally NORed with the CD and WT bits to control
cacheability on a page-by-page basis (refer to Table
3.3).

If the address is one that is permitted to be in the
cache, KEN# must be continuously asserted during
the sampling period starting from the second rising
clock edge after ADS# is asserted, through the
clock in which NA# or READY# is asserted. The entire
64 bits of the data bus will be used for the read,
regardless of the state of the byte-enable pins. Three
additional 64-bit bus cycles will be generated to fill
the rest of the 32-byte cache block.
If KEN# is found deasserted at any clock from the
clock after ADS# through the clock of the first NA#
or READY#, the data being read will not be cached
and two scenarios can occur: 1) if the cycle is due to
a data-cache miss, no subsequent cache-fill cycles
will be generated; 2) if the cycle is due to an
instruction-cache miss, additional cycle(s) will be
generated until the address reaches a 32-byte
boundary. To avoid caching a line, external hardware
must deassert KEN# during or before the first NA# or
READY#.
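The KEN# sampling window described above can be modeled on a per-clock trace. The sketch below is an assumption-laden illustration, not a timing specification: the function name is hypothetical, each list holds one pin level per clock (0 = asserted, 1 = deasserted) with index 0 being the clock in which ADS# is asserted, and the CD and WT bits are ignored.

```python
def read_line_is_cached(ken_n, na_n, ready_n):
    """Return True if KEN# stayed asserted through the sampling window.

    The window runs from the clock after ADS# (index 1) through the
    clock of the first NA# or READY# assertion, whichever comes first.
    """
    for clk in range(1, len(ken_n)):
        if ken_n[clk] != 0:
            return False  # KEN# seen deasserted inside the window
        if na_n[clk] == 0 or ready_n[clk] == 0:
            return True   # window closes at the first NA# or READY#
    raise ValueError("no NA# or READY# in trace; cycle never completed")
```

A trace in which KEN# is low in clocks 1 and 2 and READY# arrives in clock 2 is treated as cacheable; raising KEN# in either of those clocks makes the read noncacheable.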


3.1.15 PAGE TABLE BIT (PTB) 

Depending on the setting of the PBM (page-table bit 
mode) bit of the epsr, the PTB reflects the value of 
either the CD (cache disable) bit or the WT (write 
through) bit of the page-table entry used for the cur- 
rent cycle. When paging is disabled, PTB remains 
inactive. 


Table 3.3. Cacheability based on
KEN# and CD OR WT

CD OR WT  KEN#  Meaning
0         0     Cacheable access
0         1     Noncacheable access
1         0     Noncacheable page
1         1     Noncacheable page
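Table 3.3 reduces to a single condition, which the following sketch expresses directly. The function and constant names are illustrative only; arguments are pin or bit levels (0 or 1), with KEN# given as its electrical level.

```python
LINE_BYTES = 32  # cache line size
BUS_BYTES = 8    # 64-bit data bus


def access_is_cacheable(cd: int, wt: int, ken_n: int) -> bool:
    """Table 3.3: cacheable only when (CD OR WT) = 0 and KEN# = 0."""
    return (cd | wt) == 0 and ken_n == 0


def line_fill_cycles() -> int:
    """Number of 64-bit bus cycles needed to fill one cache line."""
    return LINE_BYTES // BUS_BYTES
```

The line-fill count of four matches the one initial read plus three additional 64-bit cycles described in section 3.1.14.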


3.1.16 BOUNDARY SCAN SHIFT INPUT (SHI) 

This pin is used with the testability features. Refer to 
section 3.3. 


3.1.17 BOUNDARY SCAN ENABLE (BSCN) 

This pin is used with the testability features. Refer to
section 3.3.

3.1.18 SHIFT SCAN PATH (SCAN) 

This pin is used with the testability features. Refer to 
section 3.3. 


3.1.19 CONFIGURATION (CC1-CC0) 

These two pins are reserved by Intel. Strap both pins 
LOW. 


3.1.20 SYSTEM POWER (Vcc) AND GROUND 
(Vss) 

The i860 XR microprocessor has 48 pins for power 
and ground. All pins must be connected to the ap- 
propriate low-inductance power and ground signals 
in the system. 



3.2 Initialization 

Initialization of the i860 XR microprocessor is
caused by assertion of the RESET signal for at least
16 clocks. Table 3.4 shows the status of output pins
during the time that RESET is asserted. Note that
HOLD requests are honored during RESET and that
the status of output pins depends on whether a
HOLD request is being acknowledged.
Table 3.4. Output Pin Status during Reset

                          Pin Value
Pin Name                  HOLD Not Acknowledged  HOLD Acknowledged
ADS#, LOCK#               HIGH                   Tri-State OFF
W/R#, PTB                 LOW                    Tri-State OFF
BREQ                      LOW                    LOW
HLDA                      LOW                    HIGH
D63-D0                    Tri-State OFF          Tri-State OFF
A31-A3, BE7#-BE0#, NENE#  Undefined              Tri-State OFF


After a reset, the i860 XR microprocessor begins
executing at physical address 0xFFFFFF00. The
program-visible state of the i860 XR microprocessor
after reset is detailed in section 2.8.7.

Eight-bit code-size mode is selected when INT/CS8
is asserted during the clock before the falling edge
of RESET. While in eight-bit code-size mode,
instruction cache misses are byte reads (transferred
on D7-D0 of the data bus) instead of eight-byte
reads. This allows the i860 XR microprocessor to be
bootstrapped from an eight-bit EPROM. For these
code reads, byte enables BE2#-BE0# are redefined to
be the low-order three bits of the address, so that a
complete byte address is available. These reads
update the instruction cache if KEN# is asserted
(refer to section 3.1.14) and are not pipelined even if
NA# is asserted. While in this mode, instructions must
reside in an eight-bit wide memory, while data must
reside in a separate 64-bit wide memory. After the
code has been loaded into 64-bit memory,
initialization code can initiate 64-bit code fetches by
clearing the CS8 bit of the dirbase register (refer to
section 2). Once eight-bit code-size mode is disabled
by software, it cannot be reenabled except by
resetting the i860 XR microprocessor.


3.3 Testability 

The i860 XR microprocessor has a boundary scan
mode that may be used in component- or board-level
testing to test the signal traces leading to and
from the i860 XR microprocessor. Boundary scan
mode provides a simple serial interface that makes it
possible to test all signal traces with only a few
probes. Probes need be connected only to CLK,
BSCN, SCAN, SHI, BREQ, RESET, and HOLD.

The pins BSCN and SCAN control the boundary
scan mode (refer to Table 3.5). When BSCN is
asserted, the i860 XR microprocessor enters boundary
scan mode on the next rising clock edge. Boundary
scan mode can be activated even while RESET is
active. When BSCN is deasserted while in boundary
scan mode, the i860 XR microprocessor leaves
boundary scan mode on the next rising clock edge.
After leaving boundary scan mode, the internal state
is undefined; therefore, RESET should be asserted.


Table 3.5. Test Mode Selection

BSCN  SCAN  Testability Mode
LO    LO    No testability mode selected
LO    HI    (Reserved for Intel)
HI    LO    Boundary scan mode, normal
HI    HI    Boundary scan mode, shift
            (SHI as input; BREQ as output)


For testing purposes, each signal pin has associated
with it an internal latch. Table 3.6 identifies these
latches by name and classifies them as input, output,
or control. The input and output latches carry
the name of the corresponding pins.


Table 3.6. Test Mode Latches

Input Latch  Output Latch  Associated Control Latch
SHI
BSCN
SCAN
RESET
D0-D63       D0-D63        DATAt
CC1-CC0
             A31-A3        ADDRt
             NENE#         NENEt
             PTB           PTBt
             W/R#          W/Rt
             ADS#          ADSt
             HLDA
             LOCK#         LOCKt
READY#
KEN#
NA#
INT/CS8
HOLD
             BE7#-BE0#     BEt
             BREQ

Within boundary scan mode the i860 XR microprocessor
operates in one of two submodes: normal mode or
shift mode, depending on the value of the SCAN input.
A typical test sequence is:

1. Enter shift mode to assign values to the latches
that correspond with the pins.

2. Enter normal mode. In normal mode the i860 XR 
microprocessor transfers the latched values to 
the output pins and latches the values that are 
being driven onto the input pins. 

3. Reenter shift mode to read the new values of the 
input pins. 
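The three-step sequence above can be sketched with a toy shift-register model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the chain length and latch positions are arbitrary (the real order is the one given in Figure 3.1), SHI feeds one end of the chain, and the bit at the other end is what appears on BREQ.

```python
class BoundaryScanModel:
    """Toy model of a boundary scan chain; not the real pin order."""

    def __init__(self, length: int) -> None:
        # latches[0] is the SHI end; latches[-1] drives BREQ.
        self.latches = [0] * length

    def shift(self, shi: int) -> int:
        """One rising CLK edge in shift mode; returns the BREQ level."""
        breq = self.latches[-1]
        self.latches = [shi & 1] + self.latches[:-1]
        return breq

    def shift_in(self, pattern) -> None:
        """Step 1: load latch values serially through SHI."""
        for bit in pattern:
            self.shift(bit)

    def normal_capture(self, input_pins: dict) -> None:
        """Step 2: in normal mode, input latches take the pin values.

        input_pins maps a (hypothetical) chain position to a pin level.
        """
        for position, level in input_pins.items():
            self.latches[position] = level & 1

    def shift_out(self) -> list:
        """Step 3: read every latch back through BREQ."""
        return [self.shift(0) for _ in range(len(self.latches))]
```

A tester built on this model would compare the list returned by `shift_out()` with the levels it drove onto the board traces during step 2.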


3.3.1 NORMAL MODE

When SCAN is deasserted, the normal mode is
selected. For each input pin (RESET, HOLD,
INT/CS8, NA#, READY#, KEN#, SHI, BSCN,
SCAN, CC1, and CC0), the corresponding latch is
loaded with the value that is being driven onto the
pin.

The tristate output pins (A31-A3, BE7#-BE0#,
W/R#, NENE#, ADS#, LOCK#, and PTB) are enabled
by the control latches ADDRt (for A31-A3),
BEt, W/Rt, NENEt, ADSt, LOCKt, and PTBt. If a
control latch is set, the corresponding output latches
drive their output pins; otherwise the pins are not
driven.

The I/O pins (D63-D0) are enabled by the control
latch DATAt, which is similar to the other control
latches. In addition, when DATAt is not set, the data
pins are treated as input pins and their values are
latched.


3.3.2 SHIFT MODE

When SCAN is asserted, the shift mode is selected.
In shift mode, the pins are organized into a boundary
scan chain. The scan chain is configured as a shift
register that is shifted on the rising edge of CLK. The
SHI pin is connected to the input of one end of the
boundary scan chain. The value of the most
significant bit of the scan chain is output on the BREQ
pin. To avoid glitches while the values are being
shifted along the chain, the tester should assert both
the RESET and HOLD pins. Then all tristate outputs
are disabled. The order of the pins within the chain is
shown in Figure 3.1.

A tester causes entry into this mode for one of two
purposes:

1. To assign values to output latches to be driven
onto output pins upon subsequent entry into normal
mode.

2. To read the values of input pins previously latched
in normal mode.

Figure 3.1. Order of Boundary Scan Chain


4.0 BUS OPERATION

A bus cycle begins when ADS# is activated and
ends when READY# is sampled active. READY# is
sampled one clock after assertion of ADS# and
thereafter until it becomes active. New cycles can
start as often as every other clock until three cycles
are outstanding. A bus cycle is considered
outstanding as long as READY# has not been asserted
to terminate that cycle. After READY# becomes
active, it is not sampled again for the following
(outstanding) cycle until the second clock after the one
during which it became active. READY# is assumed
to be inactive when it is not sampled.

With regard to how a bus cycle is generated by the
i860 XR microprocessor, there are two types of
cycles: pipelined and nonpipelined. Both types of
cycles can be either read or write cycles. A pipelined
cycle is one that starts while one or two other bus
cycles are outstanding. A nonpipelined cycle is one
that starts when no other bus cycles are outstanding.


4.1 Pipelining

An m-n read or write cycle is a cycle with a total cycle
time of m clocks and a cycle-to-cycle time of n
clocks (m >= n). Total cycle time extends from the
clock in which ADS# is activated to the clock in
which READY# becomes active, whereas
cycle-to-cycle time extends from the time that READY#
is sampled active for the previous cycle to the time
that it is sampled active again for the current cycle.
When m = n, a nonpipelined cycle is implied; m > n
implies a pipelined cycle.
Pipelining may occur for the next bus cycle any time
the current bus cycle requires more than two clock
periods to finish (m > 2). If a bus request is pending,
the next cycle will be initiated when NA# is sampled
active, even if the current cycle has not terminated.
In this case, pipelining occurs. NA# is not
recognized until after ADS# has become inactive.

To allow high transfer rates in large memory
systems, two-level pipelining is supported (i.e., there
may be up to three cycles in progress at one time).
Pipelining enables a new word of data to be
transferred every two clocks, even though the total
cycle time may be up to six clocks.
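The six-clock figure follows from simple arithmetic: with a new cycle starting every two clocks and at most three cycles outstanding, a cycle may last up to 3 x 2 = 6 clocks before the two-clock rate can no longer be sustained. The sketch below restates that reasoning; the function names are illustrative, and the clock frequency passed to the last function is whatever the system uses (no value is taken from this section).

```python
import math

BUS_BYTES = 8           # 64-bit data bus
MAX_OUTSTANDING = 3     # two levels of pipelining
MIN_CYCLE_TO_CYCLE = 2  # a new cycle can start every other clock


def cycles_in_flight(total_cycle_clocks: int) -> int:
    """Cycles that must overlap to finish one cycle every two clocks."""
    if total_cycle_clocks < MIN_CYCLE_TO_CYCLE:
        raise ValueError("a bus cycle takes at least two clocks")
    return math.ceil(total_cycle_clocks / MIN_CYCLE_TO_CYCLE)


def sustains_peak_rate(total_cycle_clocks: int) -> bool:
    """True if an m-clock cycle can still be run as an m-2 cycle."""
    return cycles_in_flight(total_cycle_clocks) <= MAX_OUTSTANDING


def peak_bytes_per_second(clk_hz: float) -> float:
    """Peak transfer rate: one 64-bit word every two clocks."""
    return clk_hz * BUS_BYTES / MIN_CYCLE_TO_CYCLE
```

Under this model a 5-clock or 6-clock memory still delivers a word every two clocks, while a 7-clock memory would need a fourth outstanding cycle and therefore cannot.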


4.2 Bus State Machine 

The operation of the bus is described in terms of a
bus state machine using a state transition diagram.
Figure 4.1 illustrates the i860 XR microprocessor
bus state machine. A bus cycle is composed of two
or more states. Each bus state lasts for one CLK
period.

The i860 XR microprocessor supports up to two
levels of address pipelining. Once it has started the
first bus cycle, it can generate up to two more cycles
as long as READY# remains inactive. To start a new
bus cycle while other cycles are still outstanding,
NA# must be active for at least one clock cycle
starting with the clock after the previous ADS#.
NA# is latched internally.

States Tj and Tjk, for j = {1,2,3} and k = {1,2}, are
used to describe the state of the i860 XR
microprocessor bus state machine. Index j indicates
the number of outstanding bus cycles while index k
distinguishes the intermediate states for the j-th
outstanding cycle. Therefore there can be up to three
outstanding cycles, and there are two possible
intermediate states for each level of pipelining. Tj1 is
the next state after Tj, as long as j cycles are
outstanding. Tj2 is entered when NA# is active but the
i860 XR microprocessor is not ready to start a new
cycle.

Five conditions have to be met to start a new cycle
while one or more cycles are already pending:

1. READY# inactive

2. NA# having been active

3. An internal request pending (BREQ active)

4. HOLD not active 

5. Fewer than three cycles outstanding 

Note that BREQ is asserted on the clock after the 
i860 XR microprocessor realizes an internal request 
for the bus. 
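The five conditions listed above combine into one predicate. The following is a minimal sketch with hypothetical names; inputs are logical (True = active), and `na_latched` stands for the internally latched NA# described earlier.

```python
MAX_OUTSTANDING = 3


def can_start_pipelined_cycle(ready_active: bool, na_latched: bool,
                              breq: bool, hold: bool,
                              outstanding: int) -> bool:
    """Check the five conditions for starting a pipelined bus cycle.

    Applies only while one or more cycles are already pending, so
    outstanding must be at least 1.
    """
    if outstanding < 1:
        raise ValueError("predicate applies only with a cycle pending")
    return (not ready_active          # 1. READY# inactive
            and na_latched            # 2. NA# having been active
            and breq                  # 3. internal request pending
            and not hold              # 4. HOLD not active
            and outstanding < MAX_OUTSTANDING)  # 5. fewer than three
```

For instance, with two cycles outstanding, NA# latched, BREQ active, and neither READY# nor HOLD active, a third cycle may start; with three already outstanding it may not.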

Upon hardware RESET, the bus control logic enters
the idle state TI and awaits an internal request for a
bus cycle. If a bus cycle is requested while there is
no hold request from the system, a bus cycle begins,
advancing to state T1. On the next cycle, the state
machine automatically advances to state T11. If
READY# is active in state T11, the bus control logic
returns either to TI, if no new cycle is started, or to
T1, if a new cycle request is pending internally. In
fact, if an internal bus request is pending each time
READY# is active, the state machine continues to
cycle between T11 and T1.

However, if READY# is not active but the next
address request is pending (as indicated by an active
NA#), the state machine advances either to state
T2 (if an internal bus request is pending, signifying
that two bus cycles are now outstanding), or to state
T12 (if no internal bus request is pending, signifying
NA# has been found active). Transitions from state
T12 are similar to those from T11.
If two bus cycles are already outstanding (as
indicated by T2k for k = {1,2}) and NA# is latched
active but READY# is not active, one more bus request
causes entry into state T3. Transitions from this
state are similar to those from T2.

In general, if there is an internal bus request each
time both READY# and NA# are active, the state
machine continues to oscillate between Tj1 and Tj,
for j = {2,3}.

When NA# is sampled active while there is a pending
bus request, ADS# is activated in the next clock
period (provided no more than two cycles are
already outstanding).
Internal pending bus requests start new bus cycles
only if no HOLD request has been recognized. TH is
entered from the idle state TI, T11, and T12. HLDA is
active in this state. There is a one clock delay to
synchronize the HOLD input when the signal meets
the respective minimum setup and hold time
requirements. The state machine uses the synchronized
HOLD to move from state to state.


4.3 Bus Cycles 

Figures 4.2 through 4.10 illustrate combinations of 
bus cycles. 


4.3.1 NONPIPELINED READ CYCLES 

A read cycle begins with the clock in which ADS# is
asserted. The i860 XR microprocessor begins driving
the address during this clock. It samples
READY# for active state every clock after the first
clock. A minimum of two clocks is required per cycle.
Data is latched when READY# is found active when
sampled at the end of a clock period. Figure 4.2
illustrates nonpipelined read cycles with zero wait
states.




Figure 4.2. Fastest Read Cycles 
Figure 4.3. Fastest Write Cycles


4.3.2 NONPIPELINED WRITE CYCLES 

The ADS# and READY# activity for write cycles 
follows the same logic as that for read cycles, as 
Figure 4.3 illustrates for back-to-back, nonpipelined 
write cycles with zero wait-states. 

The fastest write cycle takes only two clocks to
complete. However, when a read cycle immediately
precedes a write cycle, the write cycle must contain a
wait state, as illustrated in Figure 4.4. Because the
device being read might still be driving the data bus
during the first clock of the write cycle, there is a
potential for bus contention. To help avoid such
contention, the i860 XR microprocessor does not drive
the data bus until the second clock of the write
cycle. The wait state is required to provide the
additional time necessary to terminate the write cycle.
In other read-write combinations, the i860 XR
microprocessor does not require a wait state.
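The turnaround rule can be summarized in a few lines. This sketch covers only the fastest nonpipelined case described above; the function name and the string encoding of cycle types are assumptions made for illustration.

```python
MIN_CYCLE_CLOCKS = 2  # fastest read or write cycle


def fastest_cycle_clocks(kind: str, previous_kind: str = "") -> int:
    """Minimum clocks for a nonpipelined cycle.

    kind and previous_kind are "read" or "write"; previous_kind may be
    empty when no cycle immediately precedes this one.
    """
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    clocks = MIN_CYCLE_CLOCKS
    if kind == "write" and previous_kind == "read":
        clocks += 1  # one wait state for data bus turnaround
    return clocks
```

A write directly after a read therefore takes at least three clocks, while write-after-write, read-after-write, and read-after-read remain at two.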
Figure 4.4. Fastest Read/Write Cycles

4.3.3 PIPELINED READ AND WRITE CYCLES

Figures 4.5 and 4.6 illustrate combinations of
nonpipelined and pipelined read and write cycles. The
following description applies to both diagrams. While
Cycle 1 is still in progress, two new cycles are
initiated. By the time READY# first becomes active, the
state machine has moved through states T1, T11,
T2, T21, and T3. Cycles 3 and 4 show how activating
READY# terminates the corresponding outstanding
cycle, and yet activating NA# while there is an
internal request pending adds a new outstanding cycle.

In Figure 4.5, Cycle 3 is a write cycle following a read
cycle; therefore, one wait state must be inserted.
The i860 XR microprocessor does not drive the data
bus until one clock after the read data is returned
from the preceding read cycle. During Cycles 3 and
4, the state machine oscillates between states T3
and T31, maintaining full bus capacity (two levels of
pipelining; three outstanding cycles). Cycles 2, 3,
and 4 in Figure 4.6 are 5-2 cycles; i.e., each requires
a total cycle time of five clocks while the throughput
rate is one cycle every two clocks.

Figure 4.7 illustrates in a more general manner how
the NA# signal controls pipelining. Cycle 1 is a 2-2
cycle, the fastest possible. The next cycle cannot be
started any earlier; therefore, there is no need to
activate NA# to start the next cycle early. Cycle 2, a
3-3 read, is different. Cycle 3 can be started during
the third state (a wait state) of Cycle 2, and NA# is
asserted to accomplish this.

NA# is not activated following the ADS# clock of
Cycle 3, thereby allowing Cycle 3 to terminate
before the start of Cycle 4. As a result, Cycle 4 is a
nonpipelined cycle.





Figure 4.9. Locked Cycles 


When there is no internal bus request, activating
NA# does not start a new cycle; the i860 XR
microprocessor, however, remembers that NA# has been
activated. Figure 4.8 illustrates the situation where
NA# is active but no internal bus request is pending.
NA# is activated when two cycles are outstanding.
Because there is no internal request pending until
after one idle state, no new bus cycle is started
during that period.

4.3.4 LOCKED CYCLES 

The LOCK# signal is asserted when the current bus
cycle is to be locked with the next bus cycle.
Assertion of LOCK# may be initiated by a program's
setting the BL bit of the dirbase register using the
lock instruction (refer to section 2) or by the i860 XR
microprocessor itself during page table updates.

In Figure 4.9, the first read cycle is to be locked with 
the following write cycle. If there were idle states 
between the cycles, the LOCK# signal would re- 
main asserted. This is the case for a read/modify/ 
write operation. Cycle 3 is not locked because 
LOCK# is no longer asserted when Cycle 2 starts. 


4.3.5 HOLD AND BREQ ARBITRATION CYCLES 

The HOLD, HLDA, and BREQ signals permit bus ar- 
bitration between the i860 XR microprocessor and 
another bus master. 

See Figure 4.10. When HOLD is asserted, the i860 
XR microprocessor does not relinquish control of 
the bus until all outstanding cycles are completed. If 
HOLD were asserted one clock earlier, the last i860 
XR microprocessor bus cycle before HLDA would 
not be started. 

HOLD is sampled at the end of the clock in which it
is activated. Recommended setup and hold times
must be met to guarantee sampling one clock after
external HOLD activation. When HOLD is sampled
active, a one clock delay for internal synchronization
follows. Likewise when HOLD is deasserted, there is
a one-clock delay for internal synchronization before
HLDA is deasserted. The outputs (except HLDA and
BREQ) float when HLDA is asserted.
Figure 4.10. HOLD, HLDA, and BREQ


If, during a HOLD cycle, an internal bus request is
generated, BREQ is activated even though HLDA is
asserted. It remains active at least until the clock
after ADS# is activated for the requested cycle.


4.4 Bus States During RESET

Figure 4.11 shows how INT/CS8 is sampled during
the clock period just before the falling edge of
RESET. If INT/CS8 is sampled active, the i860 XR
microprocessor enters CS8 mode. No inputs (except
for HOLD and INT/CS8) are sampled during RESET.

Note that, because HOLD is recognized even while
RESET is active, the HLDA output signal may also
become active during RESET. Refer to Table 3.4
"Output Pin Status during Reset".






5.0 MECHANICAL DATA 

Figures 5.1 and 5.2 show the locations of pins; Tables 5.1 and 5.2 help to locate pin identifiers.



Figure 5.1. Pin Configuration - View from Top Side

Figure 5.2. Pin Configuration - View from Pin Side
Table 5.1. Pin Cross Reference by Location

Location  Signal   Location  Signal   Location  Signal   Location  Signal
A1        Vcc      C9        D47      J15       D17      Q10       BE6#
A2        Vss      C10       D43      J16       D14      Q11       BE4#
A3        Vcc      C11       D39      J17       D16      Q12       BE0#
A4        Vss      C12       D37      K1        A21      Q13       BSCN
A5        D56      C13       D35      K2        A18      Q14       D1
A6        D52      C14       D33      K3        A16      Q15       D2
A7        D50      C15       D32      K15       D13      Q16       Vss
A8        D48      C16       Vss      K16       D15      Q17       Vcc
A9        D46      C17       Vcc      K17       D12      R1        Vss
A10       D44      D1        Vss      L1        A19      R2        Vcc
A11       D40      D2        Vcc      L2        A15      R3        Vss
A12       D38      D3        D62      L3        A14      R4        Vcc
A13       Vcc      D15       D31      L15       D11      R5        A4
A14       Vss      D16       D30      L16       D8       R6        NENE#
A15       Vcc      D17       Vss      L17       D10      R7        HLDA
A16       Vss      E1        Vcc      M1        A17      R8        KEN#
A17       Vcc      E2        CC0      M2        A13      R9        NA#
B1        Vss      E3        CC1      M3        A11      R10       BE7#
B2        Vcc      E15       D29      M15       D7       R11       BE2#
B3        Vss      E16       D28      M16       D9       R12       BE1#
B4        D59      E17       D26      M17       D6       R13       SCAN
B5        D58      F1        A31      N1        A12      R14       D0
B6        D54      F2        A28      N2        A10      R15       Vss
B7        D53      F3        A30      N3        A9       R16       Vcc
B8        D49      F15       D27      N15       D5       R17       Vss
B9        D45      F16       D25      N16       D4       S1        Vcc
B10       D42      F17       D24      N17       Vcc      S2        Vss
B11       D41      G1        A29      P1        Vss      S3        Vcc
B12       D36      G2        A27      P2        A8       S4        Vss
B13       D34      G3        A26      P3        A7       S5        Vcc
B14       Vcc      G15       D23      P15       D3       S6        W/R#
B15       Vss      G16       D22      P16       Vcc      S7        ADS#
B16       Vcc      G17       D20      P17       Vss      S8        LOCK#
B17       Vss      H1        A25      Q1        Vcc      S9        INT/CS8
C1        Vcc      H2        A24      Q2        Vss      S10       BE5#
C2        Vss      H3        A22      Q3        A6       S11       BE3#
C3        D60      H15       D21      Q4        A5       S12       SHI
C4        D63      H16       D19      Q5        A3       S13       RESET
C5        D61      H17       D18      Q6        PTB      S14       Vss
C6        D57      J1        A23      Q7        BREQ     S15       Vcc
C7        D55      J2        A20      Q8        READY#   S16       Vss
C8        D51      J3        CLK      Q9        HOLD     S17       Vcc
Table 5.2. Pin Cross Reference by Pin Name

Signal  Location  Signal  Location  Signal   Location  Signal  Location
A3      Q5        CLK     J3        D41      B11       Vcc     B16
A4      R5        D0      R14       D42      B10       Vcc     C1
A5      Q4        D1      Q14       D43      C10       Vcc     C17
A6      Q3        D2      Q15       D44      A10       Vcc     D2
A7      P3        D3      P15       D45      B9        Vcc     E1
A8      P2        D4      N16       D46      A9        Vcc     N17
A9      N3        D5      N15       D47      C9        Vcc     P16
A10     N2        D6      M17       D48      A8        Vcc     Q1
A11     M3        D7      M15       D49      B8        Vcc     Q17
A12     N1        D8      L16       D50      A7        Vcc     R2
A13     M2        D9      M16       D51      C8        Vcc     R4
A14     L3        D10     L17       D52      A6        Vcc     R16
A15     L2        D11     L15       D53      B7        Vcc     S1
A16     K3        D12     K17       D54      B6        Vcc     S3
A17     M1        D13     K15       D55      C7        Vcc     S5
A18     K2        D14     J16       D56      A5        Vcc     S15
A19     L1        D15     K16       D57      C6        Vcc     S17
A20     J2        D16     J17       D58      B5        Vss     A2
A21     K1        D17     J15       D59      B4        Vss     A4
A22     H3        D18     H17       D60      C3        Vss     A14
A23     J1        D19     H16       D61      C5        Vss     A16
A24     H2        D20     G17       D62      D3        Vss     B1
A25     H1        D21     H15       D63      C4        Vss     B3
A26     G3        D22     G16       HLDA     R7        Vss     B15
A27     G2        D23     G15       HOLD     Q9        Vss     B17
A28     F2        D24     F17       INT/CS8  S9        Vss     C2
A29     G1        D25     F16       KEN#     R8        Vss     C16
A30     F3        D26     E17       LOCK#    S8        Vss     D1
A31     F1        D27     F15       NA#      R9        Vss     D17
ADS#    S7        D28     E16       NENE#    R6        Vss     P1
BE0#    Q12       D29     E15       PTB      Q6        Vss     P17
BE1#    R12       D30     D16       READY#   Q8        Vss     Q2
BE2#    R11       D31     D15       RESET    S13       Vss     Q16
BE3#    S11       D32     C15       SCAN     R13       Vss     R1
BE4#    Q11       D33     C14       SHI      S12       Vss     R3
BE5#    S10       D34     B13       Vcc      A1        Vss     R15
BE6#    Q10       D35     C13       Vcc      A3        Vss     R17
BE7#    R10       D36     B12       Vcc      A13       Vss     S2
BREQ    Q7        D37     C12       Vcc      A15       Vss     S4
BSCN    Q13       D38     A12       Vcc      A17       Vss     S14
CC0     E2        D39     C11       Vcc      B2        Vss     S16
CC1     E3        D40     A11       Vcc      B14       W/R#    S6
Table 5.3. Ceramic PGA Package Dimension Symbols

Letter or
Symbol     Description of Dimensions
A          Distance from seating plane to highest point of body
A1         Distance between seating plane and base plane (lid)
A2         Distance from base plane to highest point of body
A3         Distance from seating plane to bottom of body
B          Diameter of terminal lead pin
D          Largest overall package dimension of length
D1         A body length dimension, outer lead center to outer lead center
e1         Linear spacing between true lead position centerlines
L          Distance from seating plane to end of lead
S1         Other body dimension, outer lead center to edge of body


NOTES:

1. Controlling dimension: millimeter.

2. Dimension "e1" ("e") is non-cumulative.

3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.

4. Dimensions "B", "B1" and "C" are nominal.

5. Details of Pin 1 identifier are optional.
[Package outline drawing: 45° chamfer (index corner), swagged pins (4 places), seating plane, swagged pin detail.]

Family: Ceramic Pin Grid Array Package

              Millimeters                     Inches
Symbol    Min      Max      Notes         Min      Max      Notes

A         3.56     4.57                   0.140    0.180
A1        0.64     1.14     SOLID LID     0.025    0.045    SOLID LID
A2        2.79     3.56     SOLID LID     0.110    0.140    SOLID LID
A3        1.14     1.40                   0.045    0.055
B         0.43     0.51                   0.017    0.020
D         44.07    44.83                  1.735    1.765
D1        40.51    40.77                  1.595    1.605
e1        2.29     2.79                   0.090    0.110
L         2.54     3.30                   0.100    0.130
N         168               # of Pins     168               # of Pins
S1        1.52     2.54                   0.060    0.100

ISSUE IWS REV X 7/15/88

Figure 5.3. 168 Lead Ceramic PGA Package Dimensions


6.0 PACKAGE THERMAL 
SPECIFICATIONS 

For this section, let: 

P = maximum power consumption 

Tc = case temperature 

Ta = ambient air temperature 

θCA = thermal resistance from case to ambient air

θJC = thermal resistance from junction to case

θJA = thermal resistance from junction to ambient air


The i860 XR microprocessor is specified for operation when Tc is within the
range of 0°C-85°C. Tc may be measured in any environment to determine
whether the i860 XR microprocessor is within specified operating range. The
case temperature should be measured at the center of the top surface opposite
the pins.

Ta can be calculated from θCA (thermal resistance
from case to ambient) with the following equation:

Ta = Tc - P * θCA
Typical values for θCA and θJC at various airflows are given in Table 6.1
for the 1.75 sq. in., 168 pin, ceramic PGA. θJC is also shown so that θJA
can be calculated by:

θCA = θJA - θJC

Note that θJC with a heatsink differs from θJC without a heatsink because
case temperature is measured differently. Case temperature for θJC with
heatsink is measured at the center of the heat fin base. Case temperature
for θJC without heatsink is measured at the center of package top surface.


Table 6.2 shows the maximum Ta allowable (without exceeding Tc) at various
airflows and operating frequencies (fCLK).

Note that Ta is greatly improved by attaching "fins" or a "heat sink" to the
package. P (the maximum power consumption) is calculated by using the maximum
ICC at 5V as tabulated in the DC Characteristics of section 7.

Figure 6.1 gives typical ICC derating with case temperature. For more
information on heat sinks, measurement techniques, or package characteristics,
refer to Intel Packaging Handbook, order number 240800.
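
As a worked illustration of the relation Ta = Tc - P * θCA, the following minimal Python sketch computes the maximum allowable ambient temperature from the maximum case temperature, the maximum ICC at 5V, and a θCA value. The function and parameter names are hypothetical and chosen only for this example; the numeric inputs are the 25 MHz and 40 MHz ICC limits of section 7 and the zero-airflow θCA with heat sink.

```python
def max_ambient_temp(tc_max_c, icc_max_a, vcc_v, theta_ca_c_per_w):
    """Return the maximum ambient temperature Ta = Tc - P * theta_CA.

    P is taken as the maximum supply current times the supply voltage,
    as described in the text (maximum ICC at 5V).
    """
    power_w = icc_max_a * vcc_v
    return tc_max_c - power_w * theta_ca_c_per_w


# 25 MHz part (500 mA at 5V), heat sink, no airflow (theta_CA = 11 C/W)
print(max_ambient_temp(85.0, 0.500, 5.0, 11.0))

# 40 MHz part (650 mA at 5V), same heat sink and airflow
print(max_ambient_temp(85.0, 0.650, 5.0, 11.0))
```

The two results correspond to the zero-airflow, with-heat-sink entries for 25.0 MHz and 40.0 MHz in Table 6.2, the latter after rounding to one decimal place.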



Table 6.1. Thermal Resistance (°C/W) θJC and θCA

                           θCA versus Airflow - ft/min (m/sec)
                  θJC     0      200     400     600     800     1000
                          (0)    (1.01)  (2.03)  (3.04)  (4.06)  (5.07)

With Heat Sink*   2       11     6       4       3.2     2.5     2.2
Without
Heat Sink         1.5     17.5   13      11      9.5     8.5     8

*Nine-fin, unidirectional heat sink (fin dimensions: 0.350" height, 0.040"
width, 0.115" center-to-center spacing, 1.530" length).
Table 6.2. Maximum Allowable Ta at Various Airflows
In °C

                  fCLK         Airflow - ft/min (m/sec)
                  (MHz)   0      200     400     600     800     1000
                          (0)    (1.01)  (2.03)  (3.04)  (4.06)  (5.07)

Ta with           25.0    57.5   70      75      77      78.8    79.5
Heat Sink*        33.3    52     67      73      75.5    77.4    78.5
                  40.0    49.3   65.5    72      74.6    76.9    77.9

Ta without        25.0    41.3   52.5    57.5    61.3    63.8    65
Heat Sink         33.3    32.5   46      52      56.5    59.5    61
                  40.0    28.1   42.8    49.3    54.1    57.4    59

*Nine-fin unidirectional heat sink (fin dimensions: 0.350" height, 0.040" width,
0.115" center-to-center spacing, 1.530" length).


7.0 ELECTRICAL DATA 

Inputs and outputs are TTL compatible, except for CLK. All input and output
timings are specified relative to the 1.5 volt level of the rising edge of
CLK and refer to the point that the signals reach 1.5V.

7.1 Absolute Maximum Ratings 


Case Temperature Tc under Bias ................ 0°C to 85°C

Storage Temperature ........................... -65°C to +150°C

Voltage on Any Pin
with Respect to Ground ........................ -0.5 to 6.5V


NOTICE: This data sheet contains preliminary information on new products in
production. The specifications are subject to change without notice. Verify
with your local Intel Sales office that you have the latest data sheet before
finalizing a design.

* WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may
cause permanent damage. These are stress ratings only. Operation beyond the
"Operating Conditions" is not recommended and extended exposure beyond the
"Operating Conditions" may affect device reliability.


7.2 D.C. Characteristics 


Table 7.1. DC Characteristics

Tc = 0°C to 85°C, Vcc = 5V ± 5%

Symbol   Parameter                    Min     Max         Units   Notes

VIL      Input LOW Voltage            -0.3    +0.8        V
VIH      Input HIGH Voltage           2.0     Vcc + 0.3   V
VILC     CLK Input LOW Voltage        -0.3    +0.8        V
VIHC     CLK Input HIGH Voltage       3.0     Vcc + 0.3   V
VOL      Output LOW Voltage                   0.45        V       (Note 1)
VOH      Output HIGH Voltage          2.4                 V       (Note 2)
ICC      Power Supply Current
           CLK = 25.0 MHz                     500         mA      Vcc @ 5V
           CLK = 33.3 MHz                     600         mA      Vcc @ 5V
           CLK = 40.0 MHz                     650         mA      Vcc @ 5V
ILI      Input Leakage Current                ±15         µA      No pullup
                                                                  or pulldown
ILO      Output Leakage Current               ±15         µA
CIN      Input Capacitance                    15          pF      (Note 3)
CO       I/O or Output Capacitance            15          pF      (Note 3)
CCLK     Clock Capacitance                    20          pF      (Note 3)

NOTES:
1. This parameter is measured at 4.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 5.0 mA for all other outputs.
2. This parameter is measured at 1.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 0.9 mA for all other outputs.
3. These are not tested. They are guaranteed by design characterization.
7.3 A.C. Characteristics

Table 7.2. A.C. Characteristics

Tc = 0°C to 85°C, Vcc = 5V ± 5%

All timings measured at CLK = 1.5V unless otherwise specified.

                                        25 MHz        33 MHz        40 MHz
Symbol  Parameter                       Min    Max    Min    Max    Min    Max    Notes
                                        (ns)   (ns)   (ns)   (ns)   (ns)   (ns)

t1      CLK Period                      40     125    30     125    25     125
t2      CLK High Time                   6             5             3             at 3V
t3      CLK Low Time                    8             7             5             at 0.8V
t4      CLK Fall Time                          7             7             7      3V-0.8V
t5      CLK Rise Time                          7             7             7      0.8V-3V
t6a     A31-A3, PTB, W/R#, NENE#        3.5    25     3.5    23     3.5    19     50 pF Load
        Valid Delay
t6b     BEn#* Valid Delay               3.5    27     3.5    25     3.5    21     50 pF Load
t7      Float Time, All                 3.5    40     3.5    30     3.5    25     (Note 1)
t8      ADS#, BREQ, LOCK#, HLDA         3.5    22     3.5    20     3.5    15     50 pF Load
        Valid Delay
t9      D63-D0 Valid Delay              3.5    38     3.5    35     3.5    31     50 pF Load
t10     Setup Time, All Inputs          13            11            8             (Note 2)
t11a    Hold Time, All Inputs except    4             4             3             (Note 2)
        DATA
t11b    DATA Hold Time                  5             4             3

NOTES:
1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not tested.
2. INT and HOLD are asynchronous inputs. The setup and hold specifications are given for test purposes or to assure
   recognition on a specific rising edge of CLK.

* n = 0, 1, ..., 7
[Graph 240296-26: TYPICAL* OUTPUT DELAY (ns) @ 1.5V versus LOAD CAPACITANCE, CL (pF), 25 to 150 pF]

NOTES:
Graphs are not linear outside the CL range shown.
nom = nominal value given in the AC timing table.
*Typical part under worst-case conditions.

Figure 7.2. Typical Output Delay vs Load Capacitance under Worst-Case Conditions


[Graph: TYPICAL* OUTPUT SLEW TIME (ns) (0.8V-2.0V) versus LOAD CAPACITANCE, CL (pF), 25 to 150 pF, with curves for D63-D0; A31-A3, BE7#-BE0#, PTB, W/R#, NENE#; and ADS#, BREQ, LOCK#, HLDA]

NOTES:
Graphs are not linear outside the CL range shown.
*Typical part under worst-case conditions.

Figure 7.3. Typical Slew Time vs Load Capacitance under Worst-Case Conditions


[Graph: supply current versus FREQUENCY (MHz), 8 to 40 MHz]

NOTES:
Graphs are not linear outside the frequency range shown.
*Worst-case supply current at 5V.
8.0 INSTRUCTION SET

Key to abbreviations:

For register operands, the abbreviations that describe the operands are composed of two parts. The first part
describes the type of register:

c        One of the control registers fir, psr, epsr, dirbase, db, or fsr

f        One of the floating-point registers: f0 through f31

i        One of the integer registers: r0 through r31

The second part identifies the field of the machine instruction into which the operand is to be placed:

src1     The first of the two source-register designators, which may be either a register or a 16-bit
         immediate constant or address offset. The immediate value is zero-extended for logical
         operations and is sign-extended for add and subtract operations (including addu and subu)
         and for all addressing calculations.

src1ni   Same as src1 except that no immediate constant or address offset value is permitted.

src1s    Same as src1 except that the immediate constant is a 5-bit value that is zero-extended to 32
         bits.

src2     The second of the two source-register designators.

dest     The destination register designator.

Thus, the operand specifier isrc2, for example, means that an integer register is used and that the encoding of
that register must be placed in the src2 field of the machine instruction.
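
The two extension rules stated for a 16-bit src1 immediate can be sketched in Python as follows. This is only an illustration under the assumption that registers are modelled as unsigned 32-bit integers; the function name and the `logical` flag are hypothetical and are not part of the instruction set definition.

```python
def extend_src1_immediate(imm16, logical):
    """Extend a 16-bit src1 immediate to 32 bits.

    logical=True  -> zero-extend (logical operations)
    logical=False -> sign-extend (add, subtract, addressing calculations)
    The result is returned as an unsigned 32-bit value.
    """
    imm16 &= 0xFFFF
    if logical:
        return imm16
    if imm16 & 0x8000:          # bit 15 set: negative value
        return imm16 | 0xFFFF0000
    return imm16


print(hex(extend_src1_immediate(0x8000, logical=True)))
print(hex(extend_src1_immediate(0x8000, logical=False)))
```

The same immediate bit pattern therefore yields different 32-bit operands depending on the class of instruction that uses it.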

Other (nonregister) operands are specified by a one-part abbreviation that represents both the type of operand
required and the instruction field into which the value of the operand is placed:

#const   A 16-bit immediate constant or address offset that the i860 XR microprocessor sign-extends
         to 32 bits when computing the effective address.

lbroff   A signed, 26-bit, immediate, relative branch offset.

sbroff   A signed, 16-bit, immediate, relative branch offset.

brx      A function that computes the target address by shifting the offset (either lbroff or sbroff) left
         by two bits, sign-extending it to 32 bits, and adding the result to the current instruction pointer
         plus four. The resulting target address may lie anywhere within the address space.

Unless otherwise specified, floating-point operations accept single- or double-precision
source operands and produce a result of equal or greater precision. Both input operands
must have the same precision. The source and result precision are specified by a two-letter
suffix to the mnemonic of the operation.

Other abbreviations include:

.p       Precision specification .ss, .sd, or .dd (.ds not permitted). Refer to Table 8.1.

.r       Precision specification .ss, .sd, .ds, or .dd. Refer to Table 8.1.

.v       .sd or .dd. Refer to Table 8.1.

.w       .ss or .dd. Refer to Table 8.1.

.x       .b (8 bits), .s (16 bits), or .l (32 bits)

.y       .l (32 bits), .d (64 bits), or .q (128 bits)

.z       .l (32 bits), or .d (64 bits)


Table 8.1. Precision Specification

         Source       Result
Suffix   Precision    Precision

.ss      single       single
.sd      single       double
.dd      double       double
.ds      double       single
mem.x(address)   The contents of the memory location indicated by address with a size of x.

PM       The pixel mask, which is considered as an array of eight bits PM[7]..PM[0], where PM[0] is
         the least significant bit.
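
The brx function described above can be sketched in Python as follows. This is a minimal illustration, not a definitive model: the function signature is hypothetical, `ip` is assumed to be the address of the branch instruction itself, and addresses are treated as unsigned 32-bit values that wrap around.

```python
def brx(offset, width, ip):
    """Compute a branch target from a signed relative offset.

    offset: raw offset field (26 bits for lbroff, 16 bits for sbroff)
    width:  number of bits in the offset field
    ip:     current instruction pointer (assumed: address of the branch)
    """
    offset &= (1 << width) - 1
    if offset & (1 << (width - 1)):      # sign bit set: negative offset
        offset -= 1 << width
    # Shift left by two bits (instructions are four bytes) and add ip + 4.
    return (ip + 4 + (offset << 2)) & 0xFFFFFFFF


print(hex(brx(1, 26, 0x1000)))          # lbroff = +1
print(hex(brx(0x3FFFFFF, 26, 0x1000)))  # lbroff = -1
```

An offset of -1 targets the branch instruction itself under the stated assumption about `ip`, because the offset is relative to the instruction pointer plus four.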


8.1 Instruction Definitions in Alphabetical Order

adds isrc1, isrc2, idest                                    Add Signed
    idest ← isrc1 + isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 < -isrc1 (signed)
    CC clear if isrc2 ≥ -isrc1 (signed)

addu isrc1, isrc2, idest                                    Add Unsigned
    idest ← isrc1 + isrc2
    OF ← bit 31 carry
    CC ← bit 31 carry

and isrc1, isrc2, idest                                     Logical AND
    idest ← isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andh #const, isrc2, idest                                   Logical AND High
    idest ← (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise

andnot isrc1, isrc2, idest                                  Logical AND NOT
    idest ← not isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andnoth #const, isrc2, idest                                Logical AND NOT High
    idest ← not (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise

bc lbroff                                                   Branch on CC
    IF   CC = 1
    THEN continue execution at brx(lbroff)
    FI

bc.t lbroff                                                 Branch on CC, Taken
    IF   CC = 1
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI

bla isrc1ni, isrc2, sbroff                                  Branch on LCC and Add
    LCC-temp clear if isrc2 < -isrc1ni (signed)
    LCC-temp set if isrc2 ≥ -isrc1ni (signed)
    isrc2 ← isrc1ni + isrc2
    Execute one more sequential instruction
    IF   LCC
    THEN LCC ← LCC-temp
         continue execution at brx(sbroff)
    ELSE LCC ← LCC-temp
    FI

bnc lbroff                                                  Branch on Not CC
    IF   CC = 0
    THEN continue execution at brx(lbroff)
    FI

bnc.t lbroff                                                Branch on Not CC, Taken
    IF   CC = 0
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI
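
The flag rules given for adds above can be illustrated with a small Python sketch. It is an assumption-laden model rather than a reference implementation: registers are modelled as unsigned 32-bit integers, "bit 31 carry" and "bit 30 carry" are taken to mean the carries out of bit positions 31 and 30, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def adds(isrc1, isrc2):
    """Model of adds: returns (idest, OF, CC)."""
    isrc1 &= MASK32
    isrc2 &= MASK32
    total = isrc1 + isrc2
    carry31 = (total >> 32) & 1                                    # carry out of bit 31
    carry30 = (((isrc1 & 0x7FFFFFFF) + (isrc2 & 0x7FFFFFFF)) >> 31) & 1
    of = carry31 ^ carry30                                         # OF <- carries differ
    cc = 1 if to_signed32(isrc2) < -to_signed32(isrc1) else 0
    return total & MASK32, of, cc


print(adds(1, 2))
print(adds(0x7FFFFFFF, 1))   # signed overflow
```

In this model CC is set exactly when the true (unbounded) signed sum is negative, which is how the condition isrc2 < -isrc1 reads.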
br lbroff                                                   Branch Direct Unconditionally
    Execute one more sequential instruction.
    Continue execution at brx(lbroff).

bri [isrc1ni]                                               Branch Indirect Unconditionally
    Execute one more sequential instruction
    IF   any trap bit in psr is set
    THEN copy PU to U, PIM to IM in psr
         clear trap bits
         IF   DS is set and DIM is reset
         THEN enter dual-instruction mode after executing one
              instruction in single-instruction mode
         ELSE IF   DS is set and DIM is set
              THEN enter single-instruction mode after executing one
                   instruction in dual-instruction mode
              ELSE IF   DIM is set
                   THEN enter dual-instruction mode
                        for next two instructions
                   ELSE enter single-instruction mode
                        for next two instructions
                   FI
              FI
         FI
    FI
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.)

bte isrc1s, isrc2, sbroff                                   Branch If Equal
    IF   isrc1s = isrc2
    THEN continue execution at brx(sbroff)
    FI

btne isrc1s, isrc2, sbroff                                  Branch If Not Equal
    IF   isrc1s ≠ isrc2
    THEN continue execution at brx(sbroff)
    FI

call lbroff                                                 Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at brx(lbroff)

calli [isrc1ni]                                             Indirect Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.
    The register isrc1ni must not be r1.)

fadd.p fsrc1, fsrc2, fdest                                  Floating-Point Add
    fdest ← fsrc1 + fsrc2

faddp fsrc1, fsrc2, fdest                                   Add with Pixel Merge
    fdest ← fsrc1 + fsrc2
    Shift and load MERGE register as defined in Table 8.2

faddz fsrc1, fsrc2, fdest                                   Add with Z Merge
    fdest ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48

famov.r fsrc1, fdest                                        Floating-Point Adder Move
    fdest ← fsrc1
    Send fsrc1 through the floating-point adder. (Preserves -0 (minus zero) when fsrc1 is -0. fsrc2
    must be coded as f0 by the assembler.)
fiadd.w fsrc1, fsrc2, fdest                                 Long-Integer Add
    fdest ← fsrc1 + fsrc2

fisub.w fsrc1, fsrc2, fdest                                 Long-Integer Subtract
    fdest ← fsrc1 - fsrc2

fix.v fsrc1, fdest                                          Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1 rounded

                                                            Floating-Point Load
fld.y isrc1(isrc2), fdest                                   (Normal)
fld.y isrc1(isrc2)++, fdest                                 (Autoincrement)
    fdest ← mem.y (isrc1 + isrc2)
    IF   autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

                                                            Cache Flush
flush #const(isrc2)                                         (Normal)
flush #const(isrc2)++                                       (Autoincrement)
    Replace block in data cache with address (#const + isrc2).
    Contents of block undefined.
    IF   autoincrement
    THEN isrc2 ← #const + isrc2
    FI

fmlow.dd fsrc1, fsrc2, fdest                                Floating-Point Multiply Low
    fdest ← low-order 53 bits of fsrc1 mantissa × fsrc2 mantissa
    fdest bit 53 ← most significant bit of mantissa

fmov.r fsrc1, fdest                                         Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    fmov.ss fsrc1, fdest = fiadd.ss fsrc1, f0, fdest
    fmov.dd fsrc1, fdest = fiadd.dd fsrc1, f0, fdest
    fmov.sd fsrc1, fdest = famov.sd fsrc1, fdest
    fmov.ds fsrc1, fdest = famov.ds fsrc1, fdest

fmul.p fsrc1, fsrc2, fdest                                  Floating-Point Multiply
    fdest ← fsrc1 × fsrc2

fnop                                                        Floating-Point No Operation
    Assembler pseudo-operation
    fnop = shrd r0, r0, r0

form fsrc1, fdest                                           OR with MERGE Register
    fdest ← fsrc1 OR MERGE
    MERGE ← 0

frcp.p fsrc2, fdest                                         Floating-Point Reciprocal
    fdest ← 1/fsrc2 with maximum mantissa error < 2^-7

frsqr.p fsrc2, fdest                                        Floating-Point Reciprocal Square Root
    fdest ← 1/SQRT(fsrc2) with maximum mantissa error < 2^-7

                                                            Floating-Point Store
fst.y fdest, isrc1(isrc2)                                   (Normal)
fst.y fdest, isrc1(isrc2)++                                 (Autoincrement)
    mem.y (isrc2 + isrc1) ← fdest
    IF   autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

fsub.p fsrc1, fsrc2, fdest                                  Floating-Point Subtract
    fdest ← fsrc1 - fsrc2

ftrunc.v fsrc1, fdest                                       Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1

fxfr fsrc1, idest                                           Transfer F-P to Integer Register
    idest ← fsrc1
fzchkl fsrc1, fsrc2, fdest                                  32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least-significant field.

    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM[i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

fzchks fsrc1, fsrc2, fdest                                  16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least-significant field.

    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM[i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
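
The fzchks pseudocode above can be followed step by step in a short Python sketch. It assumes 64-bit operands held as Python integers and an 8-bit pixel mask; the function name and return convention are hypothetical, and the MERGE register is not modelled beyond noting that it is cleared.

```python
def fzchks(fsrc1, fsrc2, pm):
    """Model of the 16-bit Z-buffer check: returns (fdest, new PM)."""
    pm = (pm & 0xFF) >> 4                  # PM <- PM shifted right by 4 bits
    fdest = 0
    for i in range(4):                     # four 16-bit fields, field 0 least significant
        z1 = (fsrc1 >> (16 * i)) & 0xFFFF
        z2 = (fsrc2 >> (16 * i)) & 0xFFFF
        if z2 <= z1:                       # unsigned comparison
            pm |= 1 << (i + 4)             # PM[i + 4] <- 1
        fdest |= min(z1, z2) << (16 * i)   # keep the smaller Z value
    return fdest, pm                       # MERGE <- 0 is not modelled here


fdest, pm = fzchks(0x0004000300020001, 0x0001000300050000, 0xF0)
print(hex(fdest), bin(pm))
```

Each call moves the previous four comparison results into the low half of PM and places the four new results in the high half.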

intovr                                                      Software Trap on Integer Overflow
    If OF in epsr = 1, generate trap with IT set in psr.

ixfr isrc1ni, fdest                                         Transfer Integer to F-P Register
    fdest ← isrc1ni

ld.c csrc2, idest                                           Load from Control Register
    idest ← csrc2

ld.x isrc1(isrc2), idest                                    Load Integer
    idest ← mem.x (isrc1 + isrc2)

lock                                                        Begin Interlocked Sequence
    Set BL in dirbase. The next load or store that misses the cache locks that location.
    Disable interrupts until the bus is unlocked.

mov isrc2, idest                                            Register-Register Move
    Assembler pseudo-operation
    mov isrc2, idest = shl r0, isrc2, idest

mov const32, idest                                          Constant-to-Register Move
    Assembler pseudo-operation
    adds l%const32, r0, idest
        . . . when const32 < 0x8000
    orh h%const32, r0, idest
    or l%const32, idest, idest
        . . . when const32 ≥ 0x8000
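
The two expansions of the constant-to-register mov pseudo-operation can be sketched in Python as follows. The sketch assumes that l%const32 and h%const32 denote the low-order and high-order 16 bits of the constant; the function name and the textual output format are hypothetical and do not reflect the output of any particular assembler.

```python
def expand_mov(const32, idest="idest"):
    """Return the instruction sequence for 'mov const32, idest'."""
    const32 &= 0xFFFFFFFF
    if const32 < 0x8000:
        # A single adds suffices: the sign-extended 16-bit immediate
        # equals the constant when bit 15 is clear.
        return [f"adds {const32:#x}, r0, {idest}"]
    high = const32 >> 16        # h%const32
    low = const32 & 0xFFFF      # l%const32
    return [
        f"orh {high:#x}, r0, {idest}",
        f"or {low:#x}, {idest}, {idest}",
    ]


print(expand_mov(0x1234))
print(expand_mov(0x12348000))
```

The two-instruction form is needed for larger constants because adds sign-extends its immediate, whereas or and orh zero-extend theirs.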


nop                                                         Core-Unit No Operation
    Assembler pseudo-operation
    nop = shl r0, r0, r0

or isrc1, isrc2, idest                                      Logical OR
    idest ← isrc1 OR isrc2
    CC set if result is zero, cleared otherwise

orh #const, isrc2, idest                                    Logical OR High
    idest ← (#const shifted left 16 bits) OR isrc2
    CC set if result is zero, cleared otherwise



pfadd.p fsrc1, fsrc2, fdest                                 Pipelined Floating-Point Add
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 + fsrc2

pfaddp fsrc1, fsrc2, fdest                                  Pipelined Add with Pixel Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift and load MERGE register from last stage Graphics result as defined in Table 8.2

pfaddz fsrc1, fsrc2, fdest                                  Pipelined Add with Z Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48 from last stage Graphics result

pfam.p fsrc1, fsrc2, fdest                                  Pipelined Floating-Point Add and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfamov.r fsrc1, fdest                                       Pipelined Floating-Point Adder Move
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1

pfeq.p fsrc1, fsrc2, fdest                                  Pipelined Floating-Point Equal Compare
    fdest ← last stage Adder result
    CC set if fsrc1 = fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfgt.p fsrc1, fsrc2, fdest                                  Pipelined Floating-Point Greater-Than Compare
    (Assembler clears R-bit of instruction)
    fdest ← last stage Adder result
    CC set if fsrc1 > fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfiadd.w fsrc1, fsrc2, fdest                                Pipelined Long-Integer Add
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2

pfisub.w fsrc1, fsrc2, fdest                                Pipelined Long-Integer Subtract
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 - fsrc2

pfix.v fsrc1, fdest                                         Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
                             equal to integer part of fsrc1 rounded

                                                            Pipelined Floating-Point Load
pfld.z isrc1(isrc2), fdest                                  (Normal)
pfld.z isrc1(isrc2)++, fdest                                (Autoincrement)
    fdest ← mem.z (third previous pfld's (isrc1 + isrc2))
    (where .z is precision of third previous pfld.z)
    IF   autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

pfle.p fsrc1, fsrc2, fdest                                  Pipelined F-P Less-Than or Equal Compare
    Assembler pseudo-operation, identical to pfgt.p except that
    assembler sets R-bit of instruction.
    fdest ← last stage Adder result
    CC clear if fsrc1 ≤ fsrc2, else set
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs
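
The three-step pattern shared by the pipelined adder instructions above (deliver the last-stage result, advance the pipeline, load the first stage) can be modelled with a short Python sketch. The class name is hypothetical, and the number of stages is an assumption made only for illustration, since this section does not state the pipeline depth; a stage that has not yet been loaded is represented by None.

```python
from collections import deque


class AdderPipeline:
    """Toy model of the A (adder) pipeline as described for pfadd.p."""

    def __init__(self, stages=3):          # stage count is an assumption
        # Index 0 is the first stage, index -1 the last stage.
        self.stages = deque([None] * stages, maxlen=stages)

    def pfadd(self, fsrc1, fsrc2):
        result = self.stages[-1]            # fdest <- last stage Adder result
        # Advance one stage and load the first stage with the new sum;
        # the bounded deque discards the old last-stage entry.
        self.stages.appendleft(fsrc1 + fsrc2)
        return result


pipe = AdderPipeline()
for a, b in [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (0.0, 0.0)]:
    print(pipe.pfadd(a, b))
```

With the assumed depth, the sum of the first operand pair is delivered by the fourth pfadd, which shows why the destination of a pipelined instruction receives the result of an earlier operation.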



pfmam.p fsrc1, fsrc2, fdest                                 Pipelined Floating-Point Add and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfmov.r fsrc1, fdest                                        Pipelined Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    pfmov.ss fsrc1, fdest = pfiadd.ss fsrc1, f0, fdest
    pfmov.dd fsrc1, fdest = pfiadd.dd fsrc1, f0, fdest
    pfmov.sd fsrc1, fdest = pfamov.sd fsrc1, fdest
    pfmov.ds fsrc1, fdest = pfamov.ds fsrc1, fdest

pfmsm.p fsrc1, fsrc2, fdest                                 Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2

pfmul.p fsrc1, fsrc2, fdest                                 Pipelined Floating-Point Multiply
    fdest ← last stage Multiplier result
    Advance M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pfmul3.dd fsrc1, fsrc2, fdest                               Three-Stage Pipelined Multiply
    fdest ← last stage Multiplier result
    Advance 3-Stage M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pform fsrc1, fdest                                          Pipelined OR to MERGE Register
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 OR MERGE
    MERGE ← 0

pfsm.p fsrc1, fsrc2, fdest                                  Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2

pfsub.p fsrc1, fsrc2, fdest                                 Pipelined Floating-Point Subtract
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 - fsrc2

pftrunc.v fsrc1, fdest                                      Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
                             equal to integer part of fsrc1

pfzchkl fsrc1, fsrc2, fdest                                 Pipelined 32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least significant field.

    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM[i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
pfzchks fsrc1, fsrc2, fdest                                 Pipelined 16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least significant field.

    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM[i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

pst.d fdest, #const(isrc2)                                  Pixel Store
pst.d fdest, #const(isrc2)++                                Pixel Store Autoincrement
    Pixels enabled by PM in mem.d (isrc2 + #const) ← fdest
    Shift PM right by 8/pixel size (in bytes) bits
    IF   autoincrement
    THEN isrc2 ← #const + isrc2
    FI

shl isrc1, isrc2, idest                                     Shift Left
    idest ← isrc2 shifted left by isrc1 bits

shr isrc1, isrc2, idest                                     Shift Right
    SC (in psr) ← isrc1
    idest ← isrc2 shifted right by isrc1 bits

shra isrc1, isrc2, idest                                    Shift Right Arithmetic
    idest ← isrc2 arithmetically shifted right by isrc1 bits

shrd isrc1, isrc2, idest                                    Shift Right Double
    idest ← low-order 32 bits of isrc1:isrc2 shifted right by SC bits
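
The interaction between shr, which records its shift count in SC, and shrd, which uses SC, can be sketched in Python as follows. Registers are modelled as unsigned 32-bit integers; limiting the shift count to five bits is an assumption of this sketch, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shr(isrc1, isrc2):
    """Model of shr: returns (idest, SC)."""
    sc = isrc1 & 0x1F                       # assumed 5-bit shift count
    return (isrc2 & MASK32) >> sc, sc


def shrd(isrc1, isrc2, sc):
    """Model of shrd: low 32 bits of the 64-bit value isrc1:isrc2 >> SC."""
    combined = ((isrc1 & MASK32) << 32) | (isrc2 & MASK32)
    return (combined >> sc) & MASK32


# Rotate a word right by 8: first load SC with shr, then use shrd
# with the same register supplied as both halves of the 64-bit value.
word = 0x12345678
_, sc = shr(8, word)
print(hex(shrd(word, word, sc)))
```

Supplying the same value as both sources turns the double shift into a rotate, which is one common use of this kind of instruction.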

st.c isrc1ni, csrc2                                         Store to Control Register
    csrc2 ← isrc1ni

st.x isrc1ni, #const(isrc2)                                 Store Integer
    mem.x (isrc2 + #const) ← isrc1ni

subs isrc1, isrc2, idest                                    Subtract Signed
    idest ← isrc1 - isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 > isrc1 (signed)
    CC clear if isrc2 ≤ isrc1 (signed)

subu isrc1, isrc2, idest                                    Subtract Unsigned
    idest ← isrc1 - isrc2
    OF ← NOT (bit 31 carry)
    CC ← bit 31 carry
    (i.e. CC set if isrc2 ≤ isrc1 (unsigned)
          CC clear if isrc2 > isrc1 (unsigned))

trap isrc1ni, isrc2, idest                                  Software Trap
    Generate trap with IT set in psr

unlock                                                      End Interlocked Sequence
    Clear BL in dirbase. The next load or store unlocks the bus.
    Enable interrupts after bus is unlocked.

xor isrc1, isrc2, idest                                     Logical Exclusive OR
    idest ← isrc1 XOR isrc2
    CC set if result is zero, cleared otherwise

xorh #const, isrc2, idest                                   Logical Exclusive OR High
    idest ← (#const shifted left 16 bits) XOR isrc2
    CC set if result is zero, cleared otherwise
Table 8.2. FADDP MERGE Update

Pixel Size    Fields Loaded From                  Right Shift Amount
(from PS)     Result into MERGE                   (Field Size)

8             63..56, 47..40, 31..24, 15..8       8
16            63..58, 47..42, 31..26, 15..10      6
32            63..56, 31..24                      8


8.2 Instruction Format and Encoding 

All instructions are 32 bits long and begin on a four-byte boundary. When
operands are registers, the register encodings shown in Table 8.3 are used.
There are two general core-instruction formats, REG-format and CTRL-format,
as well as a separate format for floating-point instructions.

8.2.1 REG-FORMAT INSTRUCTIONS

Within the REG-format are several variations as shown in Figure 8.1.
Table 8.4 gives the encodings for these instructions. One encoding is an
escape code that defines yet another variation: the core escape instructions.
Figure 8.2 shows the format of this group, and Table 8.5 shows the encodings.

In these instructions, the src2 field selects one of the 32 integer registers
(most instructions) or five control registers (st.c and ld.c). Dest selects
one of the 32 integer registers (most instructions) or floating-point
registers (fld, fst, pfld, pst, ixfr). For instructions where src1 is
optionally an immediate value, bit 26 of the opcode (I-bit) indicates whether
src1 is an immediate. If bit 26 is clear, an integer register is used; if
bit 26 is set, src1 is contained in the low-order 16 bits, except for bte and
btne instructions. For bte and btne, the five-bit immediate value is
contained in the src1 field. For st, bte, btne, and bla, the upper five bits
of the offset or broffset are contained in the dest field instead of src1,
and the lower 11 bits of offset are the lower 11 bits of the instruction.


Table 8.3. Register Encoding

Register                   Encoding

r0                         0
r31                        31
f0                         0
f31                        31
Fault Instruction          0
Processor Status           1
Directory Base             2
Data Breakpoint            3
Floating-Point Status      4
Extended Process Status    5


For ld and st, bits 28 and zero determine operand size as follows:

Bit 28    Bit 0     Operand Size

0         0         8-bits
0         1         8-bits
1         0         16-bits
1         1         32-bits

When src1 is an immediate and bit 28 is set, bit zero
of the immediate value is forced to zero.
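
The operand-size table for ld and st can be expressed as a small Python lookup. This is an illustrative sketch only; the function name is hypothetical, and the argument is assumed to be the full 32-bit instruction word of a ld or st instruction.

```python
def ld_st_operand_bits(word):
    """Return the operand size in bits for a ld/st instruction word."""
    bit28 = (word >> 28) & 1
    bit0 = word & 1
    if bit28 == 0:
        return 8                      # bit 0 is ignored for 8-bit operands
    return 32 if bit0 else 16


for word in (0x00000000, 0x00000001, 0x10000000, 0x10000001):
    print(hex(word), ld_st_operand_bits(word))
```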


For fld, fst, pfld, pst, and flush, bit 0 selects autoincrement addressing
if set. For fld, fst, pfld, and pst, bits one and two select the operand
size as follows:

Bit 1     Bit 2     Operand Size

0         0         64-bits
0         1         128-bits
1         0         32-bits
1         1         32-bits

When src1 is an immediate value, bits zero and one of the immediate value
are forced to zero to maintain alignment. When bit one of the immediate value
is clear, bit two is also forced to zero.

For flush, bits one and two must be zero.
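
The bit assignments for the floating-point load and store group can likewise be sketched in Python. The function names are hypothetical, and the argument is assumed to be the 32-bit instruction word of an fld, fst, pfld, or pst instruction.

```python
def is_autoincrement(word):
    """Bit 0 selects autoincrement addressing when set."""
    return bool(word & 1)


def fp_operand_bits(word):
    """Return the operand size in bits selected by bits one and two."""
    bit1 = (word >> 1) & 1
    bit2 = (word >> 2) & 1
    if bit1:
        return 32                     # bit 2 does not matter when bit 1 is set
    return 128 if bit2 else 64


for low_bits in (0b000, 0b100, 0b010, 0b110, 0b001):
    print(bin(low_bits), fp_operand_bits(low_bits), is_autoincrement(low_bits))
```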
General Format

 31      26 25    21 20    16 15    11 10                         0
| OPCODE/I |  SRC2  |  DEST  |  SRC1  | IMMEDIATE, OFFSET, OR NULL |

16-Bit Immediate Variant (except bte and btne)

 31    27 26 25    21 20    16 15                                 0
| OPCODE | 1 |  SRC2  |  DEST  |            IMMEDIATE              |

st, bla, bte, and btne

 31      26 25    21 20         16 15         11 10               0
| OPCODE/I |  SRC2  | OFFSET HIGH | SRC1 / SRC1S |   OFFSET LOW    |

bte and btne with 5-Bit Immediate

 31    27 26 25    21 20         16 15       11 10                0
| OPCODE | 1 |  SRC2  | OFFSET HIGH | IMMEDIATE |   OFFSET LOW     |


Figure 8.1. REG-Format Variations
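
The field boundaries of the REG-format can be illustrated with a short Python field extractor. This is a sketch under the assumption that the fields occupy bits 31..26 (opcode including the I-bit), 25..21 (src2), 20..16 (dest), 15..11 (src1), and 10..0, as the text and bit markers indicate; the function and key names are hypothetical.

```python
def decode_reg_format(word):
    """Split a 32-bit REG-format instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # six opcode bits; bit 26 is the I-bit
        "i_bit": (word >> 26) & 1,
        "src2": (word >> 21) & 0x1F,
        "dest": (word >> 16) & 0x1F,
        "src1": (word >> 11) & 0x1F,
        "imm16": word & 0xFFFF,          # used when src1 is a 16-bit immediate
        "low11": word & 0x7FF,
    }


def split_offset(word):
    """Reassemble the 16-bit offset of st, bla, bte and btne.

    The upper five bits come from the dest field and the lower 11 bits
    from the low-order 11 bits of the instruction.
    """
    fields = decode_reg_format(word)
    return (fields["dest"] << 11) | fields["low11"]


example = (0b100001 << 26) | (5 << 21) | (6 << 16) | 0x1234
print(decode_reg_format(example))
```

The example word is constructed by hand for illustration and is not taken from the encoding tables.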
Table 8.4. REG-Format Opcodes

Mnemonic                    Description                    Bits 31..26
ld.x                        Load Integer                   0  0  0  L  0  I
st.x                        Store Integer                  0  0  0  L  1  1
ixfr                        Integer to F-P Reg Transfer    0  0  0  0  1  0
(reserved)                                                 0  0  0  1  1  0
fld.x, fst.x                Load/Store F-P                 0  0  1  0  LS I
flush                       Flush                          0  0  1  1  0  1
pst.d                       Pixel Store                    0  0  1  1  1  1
ld.c, st.c                  Load/Store Control Register    0  0  1  1  LS 0
bri                         Branch Indirect                0  1  0  0  0  0
trap                        Trap                           0  1  0  0  0  1
(Escape for F-P Unit)                                      0  1  0  0  1  0
(Escape for Core Unit)                                     0  1  0  0  1  1
bte, btne                   Branch Equal or Not Equal      0  1  0  1  E  I
pfld.y                      Pipelined F-P Load             0  1  1  0  0  I
(CTRL-Format Instructions)                                 0  1  1  X  X  X
addu, -s, subu, -s          Add/Subtract                   1  0  0  SO AS I
shl, shr                    Logical Shift                  1  0  1  0  LR I
shrd                        Double Shift                   1  0  1  1  0  0
bla                         Branch LCC Set and Add         1  0  1  1  0  1
shra                        Arithmetic Shift               1  0  1  1  1  I
and(h)                      AND                            1  1  0  0  H  I
andnot(h)                   ANDNOT                         1  1  0  1  H  I
or(h)                       OR                             1  1  1  0  H  I
xor(h)                      XOR                            1  1  1  1  H  I
(reserved)                                                 1  1  X  X  1  0

L   Integer Length     0 = 8 bits
                       1 = 16 or 32 bits (selected by bit 0)
LS  Load/Store         0 = Load
                       1 = Store
SO  Signed/Ordinal     0 = Ordinal
                       1 = Signed
H   High               0 = and, or, andnot, xor
                       1 = andh, orh, andnoth, xorh
AS  Add/Subtract       0 = Add
                       1 = Subtract
LR  Left/Right         0 = Left Shift
                       1 = Right Shift
E   Equal              0 = Branch on Not Equal
                       1 = Branch on Equal
I   Immediate          0 = src1 is register
                       1 = src1 is immediate
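
As an illustration of how the H and I bits from the legend above combine with the opcode bits, the following sketch decodes the logical-operation group of Table 8.4 (opcodes of the form 1 1 op op H I). The function and table names are hypothetical, and the sketch assumes that the H bit is the second-lowest and the I bit the lowest of the six opcode bits.

```python
LOGICAL_OPS = {0b00: "and", 0b01: "andnot", 0b10: "or", 0b11: "xor"}


def decode_logical_opcode(opcode6: int):
    """Decode a 6-bit opcode of the form 1 1 op op H I.

    Returns (mnemonic, src1_is_immediate), or None when the opcode is not
    in the logical group or matches the reserved pattern 1 1 X X 1 0.
    """
    if not 0 <= opcode6 <= 0x3F:
        raise ValueError("opcode must be a 6-bit value")
    if opcode6 >> 4 != 0b11:
        return None                    # not a logical operation
    high = (opcode6 >> 1) & 1          # H bit
    immediate = opcode6 & 1            # I bit
    if high and not immediate:
        return None                    # 1 1 X X 1 0 is reserved
    name = LOGICAL_OPS[(opcode6 >> 2) & 0b11]
    return (name + "h" if high else name, bool(immediate))
```

For example, `decode_logical_opcode(0b110111)` yields `("andnoth", True)`, while `0b110110` falls in the reserved pattern and yields `None`.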


 31            26          15     10          5        0
| 0 1 0 0 1 1 | reserved* | SRC1 | reserved* | OPCODE |

* reserved (must be set to zero by assemblers)

Figure 8.2. Core Escape Instruction Format
Table 8.5. Core Escape Opcodes

Mnemonic      Description                    Bits 4..0
(reserved)                                   0  0  0  0  0
lock          Begin Interlocked Sequence     0  0  0  0  1
calli         Indirect Subroutine Call       0  0  0  1  0
(reserved)                                   0  0  0  1  1
intovr        Trap on Integer Overflow       0  0  1  0  0
(reserved)                                   0  0  1  0  1
(reserved)                                   0  0  1  1  0
unlock        End Interlocked Sequence       0  0  1  1  1
(reserved)                                   0  1  X  X  X
(reserved)                                   1  0  X  X  X
(reserved)                                   1  1  X  X  X
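
A minimal sketch of decoding a core escape instruction from Figure 8.2 and Table 8.5 follows. The names are hypothetical, and the field positions (escape pattern in bits 31..26, SRC1 in bits 15..11, opcode in bits 4..0, everything else reserved) are an assumption read from the bit markers of the figure.

```python
CORE_ESCAPE = 0b010011
CORE_ESCAPE_OPS = {
    0b00001: "lock",
    0b00010: "calli",
    0b00100: "intovr",
    0b00111: "unlock",
}


def decode_core_escape(word: int):
    """Return (mnemonic, src1, reserved_fields_are_zero) for a core escape word."""
    if (word >> 26) & 0x3F != CORE_ESCAPE:
        raise ValueError("not a core escape instruction")
    src1 = (word >> 11) & 0x1F
    opcode = word & 0x1F
    # Reserved fields (assumed bits 25..16 and 10..5) must be zero.
    reserved_clear = (word & 0x03FF0000) == 0 and (word & 0x000007E0) == 0
    return CORE_ESCAPE_OPS.get(opcode, "(reserved)"), src1, reserved_clear
```

An assembler-side check could reject any word for which the third element is false, since the figure requires the reserved fields to be zero.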


8.2.2 CTRL-FORMAT INSTRUCTIONS 

The CTRL instructions do not refer to registers, so instead of the register fields, they have a 26-bit relative 
branch offset. Figure 8.3 shows the format of these instructions and Table 8.6 defines the encodings. 


 31      28    25                               0
| 0 1 1 | OPC | BROFFSET                         |

BROFFSET is a signed 26-bit relative branch offset.

Figure 8.3. CTRL Instruction Format


Table 8.6. CTRL-Format Opcodes

Mnemonic      Description            Bits 28..26
(reserved)                           0  0  0
(reserved)                           0  0  1
br            Branch Direct          0  1  0
call          Call                   0  1  1
bc(.t)        Branch on CC Set       1  0  T
bnc(.t)       Branch on CC Clear     1  1  T

T   Taken     0 = bc or bnc
              1 = bc.t or bnc.t
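
The CTRL format can be sketched as a decoder that identifies the mnemonic from Table 8.6 and sign-extends the 26-bit BROFFSET field. The names `decode_ctrl` and `CTRL_OPS` are hypothetical; the sketch returns the raw signed offset and makes no assumption about how the offset is scaled into a byte address.

```python
CTRL_OPS = {
    0b010: "br",
    0b011: "call",
    0b100: "bc",
    0b101: "bc.t",
    0b110: "bnc",
    0b111: "bnc.t",
}


def decode_ctrl(word: int):
    """Return (mnemonic, signed_broffset) for a CTRL-format instruction word."""
    if (word >> 29) & 0x7 != 0b011:
        raise ValueError("not a CTRL-format instruction")
    opc = (word >> 26) & 0x7
    offset = word & 0x03FFFFFF
    if offset & 0x02000000:        # bit 25 is the sign bit of BROFFSET
        offset -= 0x04000000
    return CTRL_OPS.get(opc, "(reserved)"), offset
```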
8.2.3 FLOATING-POINT INSTRUCTIONS

The floating-point instructions also constitute an escape series. All these instructions begin with the bit se- 
quence 010010. Figure 8.4 shows the format of the floating point instructions, and Table 8.7 gives the encod- 
ings. Within the dual-operation instructions is a subcode DPC whose values are given in Table 8.8 along with 
the mnemonic that corresponds to each. 


 31            25     20     15     10              7 6      0
| 0 1 0 0 1 0 | SRC2 | DEST | SRC1 | P | D | S | R | OPCODE |

SRC1, SRC2   Source; one of 32 floating-point registers
DEST         Destination register
             (instructions other than fxfr) one of 32 floating-point registers
             (fxfr) one of 32 integer registers
S            Source Precision
             1 = Double-precision source operands
             0 = Single-precision source operands
R            Result Precision
             1 = Double-precision result
             0 = Single-precision result

Figure 8.4. Floating-Point Instruction Encoding

Table 8.7. Floating-Point Opcodes

Mnemonic      Description                     Bits 6..0
pfam          Add and Multiply*               0  0  0  DPC
pfmam         Multiply with Add*              0  0  0  DPC
pfsm          Subtract and Multiply*          0  0  1  DPC
pfmsm         Multiply with Subtract*         0  0  1  DPC
(p)fmul       Multiply                        0  1  0  0  0  0  0
fmlow         Multiply Low                    0  1  0  0  0  0  1
frcp          Reciprocal                      0  1  0  0  0  1  0
frsqr         Reciprocal Square Root          0  1  0  0  0  1  1
pfmul3.dd     3-Stage Pipelined Multiply      0  1  0  0  1  0  0
(p)fadd       Add                             0  1  1  0  0  0  0
(p)fsub       Subtract                        0  1  1  0  0  0  1
(p)fix        Fix                             0  1  1  0  0  1  0
(p)famov      Adder Move                      0  1  1  0  0  1  1
pfgt/pfle**   Greater Than                    0  1  1  0  1  0  0
pfeq          Equal                           0  1  1  0  1  0  1
(p)ftrunc     Truncate                        0  1  1  1  0  1  0
fxfr          Transfer to Integer Register    1  0  0  0  0  0  0
(p)fiadd      Long-Integer Add                1  0  0  1  0  0  1
(p)fisub      Long-Integer Subtract           1  0  0  1  1  0  1
(p)fzchkl     Z-Check Long                    1  0  1  0  1  1  1
(p)fzchks     Z-Check Short                   1  0  1  1  1  1  1
(p)faddp      Add with Pixel Merge            1  0  1  0  0  0  0
(p)faddz      Add with Z Merge                1  0  1  0  0  0  1
(p)form       OR with MERGE Register          1  0  1  1  0  1  0

*pfam and pfsm have P-bit set; pfmam and pfmsm have P-bit clear.
**pfgt has R bit cleared; pfle has R bit set.

NOTE:
All opcodes not shown are reserved.

P   Pipelining               1 = Pipelined instruction mode
                             0 = Scalar instruction mode
D   Dual-Instruction Mode    1 = Dual-instruction mode
                             0 = Single-instruction mode
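
The floating-point encoding of Figure 8.4 can be sketched as a field extractor. The names below are hypothetical, and the placement of the P, D, S, and R bits in bits 10, 9, 8, and 7 is an assumption inferred from their order in the figure and from the 7-bit opcode field (bits 6..0) of Table 8.7.

```python
FP_ESCAPE = 0b010010


def decode_fp(word: int) -> dict:
    """Split a floating-point instruction word into its fields."""
    if (word >> 26) & 0x3F != FP_ESCAPE:
        raise ValueError("not a floating-point escape instruction")
    return {
        "src2": (word >> 21) & 0x1F,
        "dest": (word >> 16) & 0x1F,
        "src1": (word >> 11) & 0x1F,
        "P": (word >> 10) & 1,   # 1 = pipelined, 0 = scalar
        "D": (word >> 9) & 1,    # 1 = dual-instruction mode
        "S": (word >> 8) & 1,    # 1 = double-precision sources
        "R": (word >> 7) & 1,    # 1 = double-precision result
        "opcode": word & 0x7F,   # look up in Table 8.7
    }
```

With this layout, a pipelined add with double-precision operands and result would carry P = 1, S = 1, R = 1, and opcode 0110000.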
The following table shows the opcode mnemonics that generate the various encodings of DPC and explains
each encoding.

Table 8.8. DPC Encoding

DPC    PFAM       PFSM       M-Unit     M-Unit     A-Unit     A-Unit     T      K
       Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load*
0000   r2p1       r2s1       KR         src2       src1       M result   No     No
0001   r2pt       r2st       KR         src2       T          M result   No     Yes
0010   r2ap1      r2as1      KR         src2       src1       A result   Yes    No
0011   r2apt      r2ast      KR         src2       T          A result   Yes    Yes
0100   i2p1       i2s1       KI         src2       src1       M result   No     No
0101   i2pt       i2st       KI         src2       T          M result   No     Yes
0110   i2ap1      i2as1      KI         src2       src1       A result   Yes    No
0111   i2apt      i2ast      KI         src2       T          A result   Yes    Yes
1000   rat1p2     rat1s2     KR         A result   src1       src2       Yes    No
1001   m12apm     m12asm     src1       src2       A result   M result   No     No
1010   ra1p2      ra1s2      KR         A result   src1       src2       No     No
1011   m12ttpa    m12ttsa    src1       src2       T          A result   Yes    No
1100   iat1p2     iat1s2     KI         A result   src1       src2       Yes    No
1101   m12tpm     m12tsm     src1       src2       T          M result   No     No
1110   ia1p2      ia1s2      KI         A result   src1       src2       No     No
1111   m12tpa     m12tsa     src1       src2       T          A result   No     No

DPC    PFMAM      PFMSM      M-Unit     M-Unit     A-Unit     A-Unit     T      K
       Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load*
0000   mr2p1      mr2s1      KR         src2       src1       M result   No     No
0001   mr2pt      mr2st      KR         src2       T          M result   No     Yes
0010   mr2mp1     mr2ms1     KR         src2       src1       M result   Yes    No
0011   mr2mpt     mr2mst     KR         src2       T          M result   Yes    Yes
0100   mi2p1      mi2s1      KI         src2       src1       M result   No     No
0101   mi2pt      mi2st      KI         src2       T          M result   No     Yes
0110   mi2mp1     mi2ms1     KI         src2       src1       M result   Yes    No
0111   mi2mpt     mi2mst     KI         src2       T          M result   Yes    Yes
1000   mrmt1p2    mrmt1s2    KR         M result   src1       src2       Yes    No
1001   mm12mpm    mm12msm    src1       src2       M result   M result   No     No
1010   mrm1p2     mrm1s2     KR         M result   src1       src2       No     No
1011   mm12ttpm   mm12ttsm   src1       src2       T          A result   Yes    No
1100   mimt1p2    mimt1s2    KI         M result   src1       src2       Yes    No
1101   mm12tpm    mm12tsm    src1       src2       T          M result   No     No
1110   mim1p2     mim1s2     KI         M result   src1       src2       No     No
1111   Intel-Reserved

*If K-load is set, KR is loaded when operand-1 of the multiplier is KR; KI is loaded when operand-1 of the multiplier is KI.
8.3 Instruction Timings

i860 XR microprocessor instructions take one clock to execute unless a
freeze condition is invoked. Freeze conditions and their associated
delays are shown in the table below. Freezes due to multiple
simultaneous cache misses result in a delay that is the sum of the
delays for processing each miss by itself. Other multiple freeze
conditions usually add only the delay of the longest individual freeze.

Freeze Condition: Instruction-cache miss
Delay: Number of clocks to read instruction (from ADS# clock to first
READY# clock) plus time to last READY# of block when jump or freeze
occurs during miss processing plus two clocks if data-cache being
accessed when instruction-cache miss occurs.

Freeze Condition: Reference to destination of ld instruction that misses
Delay: One plus number of clocks to read data (from ADS# clock to first
READY# clock) minus number of instructions executed since load (not
counting instruction that references load destination)

Freeze Condition: fld miss
Delay: One plus number of clocks until first READY# returned (for 32- or
64-bit read cycles) or until second READY# returned (for 128-bit fld.q
read cycles)

Freeze Condition: call, calli, ixfr, fxfr, ld.c, or st.c and data cache
load miss processing in progress
Delay: One plus number of clocks until first READY# returned (for 64-bit
read cycles) or until second READY# returned (for 128-bit fld.q read
cycles)

Freeze Condition: ld/st/pfld/fld/fst and data cache load miss processing
in progress
Delay: One plus number of clocks until last READY# returned

Freeze Condition: Reference to dest of ld, call, calli, fxfr, or ld.c in
the next instruction. (Dest of call and calli is r1.)
Delay: One clock
Freeze Condition: Reference to dest of fld/pfld/ixfr in the next two
instructions
Delay: Two clocks in the first instruction; one in the second instruction

Freeze Condition: bc/bnc/bc.t/bnc.t following
addu/adds/subu/subs/pfeq/pfle/pfgt
Delay: One clock

Freeze Condition: fsrc1 of multiplier operation refers to result of
previous operation
Delay: One clock

Freeze Condition: Floating-point operation or graphics-unit instruction
or fst, and scalar operation in progress other than frcp or frsqr
Delay: If the scalar operation is fadd, fix, fmlow, fmul.ss, fmul.sd,
ftrunc, or fsub, two minus the number of instructions (or dual-mode
pairs) already executed after the scalar operation. If the scalar
operation is fmul.dd, three minus the number of instructions (or
dual-mode pairs) executed after it. Add one if either or both of these
two situations occur:

1. There is an overlap between the result register of the previous
scalar operation and the source of the floating-point operation, and
the destination precision of the scalar operation is different than
the source precision of the floating-point operation.

2. The floating-point operation is pipelined and its destination is
not f0.

There is no delay if the result is negative.

Freeze Condition: Multiplier operation preceded by a double-precision
multiply
Delay: One clock

Freeze Condition: TLB miss
Delay: Five plus the number of clocks to finish two reads plus the number
of clocks to set A-bits (if necessary)

Freeze Condition: pfld when three pfld's are outstanding
Delay: One plus the number of clocks to return data from first pfld

Freeze Condition: pfld hits in the data cache
Delay: Two plus the number of clocks to finish all outstanding accesses

Freeze Condition: st, pst or fst miss, ld miss, or flush with modified
block when store path full (two stores or one 256-bit write-back
internally waiting for bus plus external bus pipeline full)
Delay: One plus the number of clocks until READY# active on next 64-bit
write cycle or second READY# of next 128-bit write cycle.

Freeze Condition: ld, fld, pfld, st, pst, or fst when address path full
(one address internally waiting for bus plus external bus pipeline full)
Delay: Number of clocks until next nonrepeated address can be issued
(i.e., an address that is not the 2nd-4th cycle of a cache fill, the
2nd-8th cycle of a CS8 mode instruction fetch, nor the 2nd cycle of a
128-bit write)

Freeze Condition: ld/fld following st/fst hit
Delay: One clock
Freeze Condition: Delayed branch not taken
Delay: One clock

Freeze Condition: Nondelayed branch taken: bc, bnc
Delay: One clock

Freeze Condition: Nondelayed branch taken: bte, btne
Delay: Two clocks

Freeze Condition: Indirect branch bri or calli
Delay: One clock

Freeze Condition: st.c
Delay: Two clocks

Freeze Condition: Result of graphics-unit instruction (other than
fmov.dd) used in next instruction when the next instruction is an adder-
or multiplier-unit instruction
Delay: One clock

Freeze Condition: Result of graphics-unit instruction used in next
instruction when the next instruction is a graphics-unit instruction
Delay: One clock

Freeze Condition: flush followed by flush
Delay: Three clocks minus the number of instructions between the two
flush instructions. There is no delay if the result is negative.

Freeze Condition: fst or pst followed by pipelined floating-point
operation that overwrites the register being stored
Delay: One clock
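
A few of the timing rules in this section reduce to simple arithmetic, sketched below. All function names are hypothetical. The sketch covers only the rule for simultaneous cache misses, the rule for other simultaneous freezes (which the text qualifies with "usually"), and the flush-followed-by-flush entry; it does not model how the two categories of simultaneous freezes combine with each other.

```python
def simultaneous_cache_miss_delay(miss_delays):
    """Simultaneous cache misses: the delays of the individual misses add up."""
    return sum(miss_delays)


def other_simultaneous_freeze_delay(freeze_delays):
    """Other simultaneous freezes usually cost only the longest individual delay."""
    return max(freeze_delays, default=0)


def flush_after_flush_delay(instructions_between: int) -> int:
    """Three clocks minus the instructions between two flushes, never negative."""
    if instructions_between < 0:
        raise ValueError("instruction count cannot be negative")
    return max(0, 3 - instructions_between)
```

For instance, two flush instructions separated by two other instructions would freeze for one clock under this reading, and three or more intervening instructions would remove the freeze entirely.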


8.4 Instruction Characteristics 

The following table lists some of the characteristics 
of each instruction. The characteristics are: 

• What processing unit executes the instruction. 
The codes for processing units are: 

A Floating-point adder unit 

E Core execution unit 

G Graphics unit 
M Floating-point multiplier unit 

• Whether the instruction is pipelined or not. A P 
indicates that the instruction is pipelined. 

• Whether the instruction is a delayed branch in- 
struction. A D marks the delayed branches. 

• Whether the instruction changes the condition 
code CC. A CC marks those instructions that 
change CC. 

• Which faults can be caused by the instruction. 
The codes used for exceptions are: 

IT Instruction Fault 

SE Floating-Point Source Exception 

RE Floating-Point Result Exception, including 

overflow, underflow, inexact result 
DAT Data Access Fault 

Note that this is not the same as specifying at
which instructions faults may be reported. A result
exception is reported on the subsequent
floating-point instruction, pst, fst, or sometimes
fld, pfld, and ixfr.

The instruction access fault IAT and the interrupt
trap IN are not shown in the table because they
can occur for any instruction.

• Performance notes. These comments regarding
optimum performance are recommendations
only. If these recommendations are not followed,
the i860 XR microprocessor automatically waits
the necessary number of clocks to satisfy internal
hardware requirements. The following notes define
the numeric codes that appear in the instruction
table:

1. The following instruction should not be a
conditional branch (bc, bnc, bc.t, or bnc.t).

2. The destination should not be a source operand
of the next two instructions.

3. A load should not directly follow a store that is
expected to hit in the data cache.

4. When the prior instruction is scalar, fsrc1
should not be the same as the fdest of the
prior operation.

5. The fdest should not reference the destination
of the next instruction if that instruction is a
pipelined floating-point operation.

6. The destination should not be a source operand
of the next instruction. (For call and calli,
the destination is r1.)
7. When the prior operation is scalar and multiplier
op1 is fsrc1, fsrc2 should not be the same
as the fdest of the prior operation.

8. When the prior operation is scalar, fsrc1 and
fsrc2 of the current operation should not be the
same as fdest of the prior operation.

9. A pfld should not immediately follow a pfld.

• Programming restrictions. These indicate combinations
of conditions that must be avoided by
programmers, assemblers, and compilers. The
following notes define the alphabetic codes that
appear in the instruction table:

a. The sequential instruction following a delayed
control-transfer instruction may not be another
control-transfer instruction (except in the case
of external interrupts), nor a trap instruction,
nor the target of a control-transfer instruction.

b. When using a bri to return from a trap handler,
programmers should take care to prevent traps
from occurring on that or on the next sequential
instruction. IM should be zero (interrupts
disabled) when the bri is executed.

c. If fdest is not zero, fsrc1 must not be the same
as fdest.

d. When fsrc1 goes to the multiplier op1, KR, or
KI, fsrc1 must not be the same as fdest.

e. If fdest is not zero, fsrc1 and fsrc2 must not be
the same as fdest.

f. isrc1 must not be the same as isrc2 for the
autoincrementing form of this instruction.

g. isrc1 must not be the same as isrc2.

• Core and Floating-Point Instruction Interaction in
Dual-Instruction Mode

1. If one of the branch-on-condition instructions
bc or bnc is paired with a floating-point compare,
the branch tests the value of the condition
code prior to the compare.

2. If an ixfr, fld, or pfld loads the same register
as a source operand in the floating-point
instruction, the floating-point instruction
references the register value before the load
updates it.

3. An fst or pst that stores a register that is the
destination register of the companion pipelined
floating-point operation will store the result
of the companion operation.

4. When the core instruction sets CC and the
floating-point instruction is pfgt, pfle, or pfeq,
CC is set according to the result of pfgt, pfle,
or pfeq.

5. When a trap instruction causes a trap in
dual-instruction mode, the floating-point instruction
has neither completed execution nor has updated
the FT bit or any result status bits. This
is not a problem when the trap is inserted by a
debugger, because the trap is replaced by the
original instruction, and the dual-mode pair is
reexecuted. However, when the trap is programmed,
the trap handler must avoid reexecuting
the trap by returning to user code at
the address in fir + 8. In this case, the trap
handler must emulate the floating-point instruction
before returning to the user code.
Emulation of the instruction must include all
side-effects (for example, the effect of its
D-bit, effect on the pipelines, and effect on FT
and result-status bits), just as if the instruction
had been executed by the processor in the
original context.

6. In dual-instruction mode, when the intovr
instruction causes a trap, the floating-point
companion instruction has completely finished
execution before the trap is taken.
Programming Restrictions for Dual-Instruction Mode

1. The result of placing a core instruction in the
low-order 32 bits or a floating-point instruction
in the high-order 32 bits is not defined (except
for shrd r0, r0, r0 which is interpreted as fnop).

2. A floating-point instruction that has the D-bit
set must be aligned on a 64-bit boundary (i.e.,
the three least-significant bits of its address
must be zero). This applies as well to the initial
32-bit floating-point instruction that triggers
the transition into dual-instruction mode, but
does not apply to the following instruction.

3. When the floating-point operation is scalar
and the core operation is fst or pst, the store
should not reference the result register of the
floating-point operation. When the core operation
is pst, the floating-point instruction cannot
be (p)fzchks or (p)fzchkl.

4. When the core instruction of a dual-mode pair
is a control-transfer operation and the previous
instruction had the D-bit set, the floating-point
instruction must also have the D-bit set.
In other words, an exit from dual-instruction
mode cannot be initiated (first instruction pair
without D-bit set) when the core instruction is
a control-transfer instruction.

5. When the core operation is a ld.c or st.c, the
floating-point operation must be d.fnop.

6. When the floating-point operation is fxfr, the
core instruction cannot be ld, ld.c, st, st.c,
call, ixfr, or any instruction that updates an
integer register (including autoincrement indexing).
Furthermore, the core instruction cannot
be a fld, fst, pst, or pfld that uses as isrc1 or
isrc2 the same register as the idest of the
fxfr. Additionally, in dual-instruction mode,
fxfr may not be used in a branch delay slot if
its destination register is referenced by the
preceding branch instruction.

7. A bri must not be executed in dual-instruction
mode if any trap bits are set.

8. When the core operation is bc.t or bnc.t, the
floating-point operation cannot be pfeq or
pfgt. The floating-point operation in the
sequentially following instruction pair cannot be
pfeq or pfgt, either.

9. A transition to or from dual-instruction mode
cannot be initiated on the instruction following
a bri.

10. An ixfr, fld, or pfld cannot update the destination
of the companion floating-point instruction
(unless the destination is f0 or f1)
or of the following pipelined floating-point
instruction (regardless of its destination register).
No overlap of register destinations is
permitted; for example, the following instructions
must not be paired:

    // Illegal case 1
    d.fmul.ss   f9, f10, f5
    fld.d       address, f4     ; Overlaps f5

    // Illegal case 2
    d.fmul.ss   f0, f0, f3
    fld.q       address, f0     ; Overlaps f3

    // Illegal case 3
    d.fmul.ss   f9, f10, f11
    fld.l       address, f5
    d.pfadd.ss  fx, fx, f4      ; Overlaps f5, if last stage result is double-precision

11. During a locked sequence, a transition to or
from dual-instruction mode is not permitted.
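
Two of the dual-instruction mode restrictions above lend themselves to mechanical checks, sketched here as a tool such as an assembler might apply them. The function names are hypothetical. The overlap check covers only the companion-instruction half of restriction 10, and it assumes that a load of width .l, .d, or .q writes one, two, or four consecutive floating-point registers, as the illegal cases in the text suggest.

```python
def d_bit_address_ok(address: int) -> bool:
    """Restriction 2: a D-bit instruction must sit on a 64-bit boundary."""
    return address & 0x7 == 0


def load_overlaps_fp_dest(load_base: int, load_regs: int,
                          fp_dest: int, fp_dest_regs: int = 1) -> bool:
    """Restriction 10 (companion case): does a load update the FP destination?

    load_regs is 1, 2, or 4 for .l, .d, or .q loads (assumed register counts);
    fp_dest_regs is 1 for a single-precision result, 2 for double precision.
    A companion destination of f0 or f1 is exempt, as the text states.
    """
    if fp_dest in (0, 1):
        return False
    loaded = set(range(load_base, load_base + load_regs))
    written = set(range(fp_dest, fp_dest + fp_dest_regs))
    return bool(loaded & written)
```

Illegal case 1 corresponds to `load_overlaps_fp_dest(4, 2, 5)` and illegal case 2 to `load_overlaps_fp_dest(0, 4, 3)`; both report an overlap.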
Table 8.9 Instruction Characteristics

Instruction   Execution   Pipelined?   Sets   Faults    Performance   Programming
              Unit        Delayed?     CC?              Notes         Restrictions
adds          E                        CC               1
addu          E                        CC               1
and           E                        CC
andh          E                        CC
andnot        E                        CC
andnoth       E                        CC
bc            E
bc.t          E           D                                           a
bla           E           D                                           a, g
bnc           E
bnc.t         E           D                                           a
br            E           D                                           a
bri           E           D                                           a, b
bte           E
btne          E
call          E           D                             6             a
calli         E           D                             6             a
fadd.p        A                               SE, RE
faddp         G                                         8
faddz         G                                         8
famov.r       A                               SE, RE
fiadd.z       G                                         8
fisub.z       G                                         8
fix.p         A                               SE, RE
fld.y         E                               DAT       2, 3          f
flush         E
fmlow.p       M                                         4
fmul.p        M                               SE, RE    4
form          G                                         8
frcp.p        M                               SE, RE
frsqr.p       M                               SE, RE
fst.y         E                               DAT       5             f
fsub.p        A                               SE, RE
ftrunc.p      A                               SE, RE
fxfr          G                                         6, 8
fzchkl        G                                         8
fzchks        G                                         8
intovr        E                               IT
ixfr          E                                         2
ld.c          E
ld.x          E                               DAT       6
or            E                        CC
orh           E                        CC
pfadd.p       A           P                   SE, RE
pfaddp        G           P                             8             e
Table 8.9 Instruction Characteristics (Continued)

Instruction   Execution   Pipelined?   Sets   Faults    Performance   Programming
              Unit        Delayed?     CC?              Notes         Restrictions
pfaddz        G           P                             8             e
pfam.p        A&M         P                             7             d
pfamov.r      A           P                   SE, RE
pfeq.p        A           P                             1
pfgt.p        A           P                             1
pfiadd.z      G           P                             8             e
pfisub.z      G           P                             8             e
pfix.p        A           P                   SE, RE
pfld.z        E           P                   DAT       2, 9          f
pfmam.p       A&M         P                   SE, RE    7             d
pfmsm.p       A&M         P                   SE, RE    7             d
pfmul.p       M           P                   SE, RE    4             c
pfmul3.dd     M           P                   SE, RE    4             c
pform         G           P                             8             e
pfsm.p        A&M         P                   SE, RE    7             d
pfsub.p       A           P                   SE, RE
pftrunc.p     A           P                   SE, RE
pfzchkl       G           P                             8
pfzchks       G           P                             8
pst.d         E                               DAT                     f
shl           E
shr           E
shra          E
shrd          E
st.c          E
st.x          E                               DAT
subs          E                        CC               1
subu          E                        CC               1
trap          E                               IT
xor           E                        CC
xorh          E                        CC

DATA SHEET REVISION REVIEW 

The following list represents the key differences be- 
tween version 002 and version 001 of the i860 XR 

Microprocessor Data Sheet. 

1. Big-endian description in section 2.3 has been 
expanded. 

2. Bit 17 of the Extended Processor Status Register
(EPSR) is the INT bit, which reflects the value
on the interrupt pin (INT), as described in section
2.2.4 entitled “EXTENDED PROCESSOR
STATUS REGISTER”. This is a documentation
update only.

3. The cacheability of a page is controlled by
NOR’ing the value of the CD, WT bits and the
KEN# input pin, as described in section 2.5
entitled “Caching and Cache Flushing” and section
3.1.14 entitled “Cache Enable (KEN#)”.
This is a documentation update only.

4. The NOTE section in section 2.5 entitled “Cach- 
ing and Cache Flushing” has been updated to 
clarify the paging requirement on changing the 
DTB field in the dirbase register. 

5. Information on register encoding is added in 
section 8.2 entitled “Instruction Format and En- 
coding”. This is a documentation update only. 

The following list represents the key differences be- 
tween version 003 and version 002 of the i860 XR 
Microprocessor Data Sheet. 
Specification Changes:

1. Specification changes for improved AC performance
are in section 7.3.

2. HOLD is acknowledged during locked bus cy- 
cles. See section 3.1.8. 

3. Additional paths have been added to the bus
state diagram to allow direct transitions from
states T12 and T11 to state TH. See Figures 4.1
and 4.10.

4. Two new instructions, (p)famov.r, have been
added. These replace (p)fadd.ds and
(p)fadd.sd in the assembler pseudo-ops
(p)fmov.r. These changes are in section 8.1
and tables 2.7, 8.7, and 8.9.

Documentation Changes: 

1. Big and little endian description has been ex- 
panded in sections 2.2.2, 2.3, and Figure 2.8. 

2. The actions and explanations of the lock, un- 
lock, and st.c dirbase changing the BL bit have 
been updated in sections 2.2.4, 3.1.5, 3.1.8, 
4.3.4, 4.3.5, and 8.1. 

3. The explanation of the AA and MA bits of the
fpsr has been expanded in section 2.2.8.

4. The explanation of the WT bit of the Page Table 
Entries has been expanded in sections 2.4.4.4 
and 2.5. 

5. A change concerning the locking of the bus during
address translation is explained in sections
2.4.5 and 2.8.5.

6. A further explanation on when to flush the data 
cache is given in section 2.5. 

7. The explanation of the floating point multiplier 
pipeline has been expanded in section 2.6.1. 

8. The explanation of BREQ has been expanded 
in section 3.1.4 and Figure 4.1. 

9. The explanation of result exceptions has been 
expanded in sections 2.8 and 3.2. 

10. Instruction fetch identification has been clarified 
in section 3.1.6 and table 3.2. 

11. Bus cycle diagrams in Figures 4.7, 4.8, and 4.10
have been clarified/corrected.

12. Precision specification .r has been added to 
section 8.0 and table 8.1. 

13. In section 8.4, performance note 9 has been 
added, programming restriction d has been 
changed, and programming restriction f has 
been added. Table 8.9 has been updated to re- 
flect these changes. 

14. The description of testability has changed in 
sections 3.3. and 3.3.2. RESET and HOLD must 
be asserted by the tester to force the chip out- 
puts to float (tri-state). 


The following list represents the major differences 

between version 004 and version 003 of the i860 XR 

Microprocessor Data Sheet: 

Section 2.2.4 The explanation of the WP bit of the
epsr has been expanded.

Section 2.8.2 More information on the instruction
trap has been added.

Section 2.8.4 The instruction access trap has been 
clarified. 

Section 2.8.7 The values of registers after a reset 
trap have been specified. 

Section 3.1.4 BREQ timing has been clarified. 

Section 3.1.5 The calculation of interrupt latency
has been corrected.

Section 3.1.6 The description of the byte-enable 
signals has been expanded. 

Section 3.1.8 The relation between the lock
instruction and the LOCK# signal has
been clarified. The BL bit should no
longer be changed by writing to the
dirbase register.

Section 6.0 The thermal specifications have been 
updated. 

Section 7.3 The A.C. Characteristics for CLK have 
changed. 

Section 7.3 Advance timing information for the 50
MHz clock rate has been added.
These timings are subject to change
without notice.

Section 8.0 The operand naming conventions
have improved.

Section 8.2.1 The encoding of the flush instruction
has been corrected.

Section 8.3 The data-dependent multiplier freeze 
has been eliminated. Other freeze 
conditions have been corrected or 
clarified. 

The following list represents the major differences 

between version 005 and version 004 of the i860 XR 

Microprocessor Data Sheet. 

Section 2.2.4 OF bit is writable only in supervisor 
mode using ST.C. 

Section 3.1.1 CLK rate has been updated. 

Section 5.0 Figure 5.3 has been corrected. 

Section 6.0 More information on measuring case 
temperature has been added. 

Section 6.0 Figure 6.1 has been updated to in- 
clude 25 MHz. 

Section 6.0 Table 6.1 has been corrected. 

Section 6.0 Table 6.2 has been updated to include
25 MHz.
Section 7.2 The D.C. Characteristics have been
updated to include 25 MHz power supply
current.

Section 7.3 The A.C. Characteristics for CLK have
been changed.

Section 7.3 50 MHz clock rate has been deleted.

Section 7.3 25 MHz A.C. Specifications have been
added.

Section 7.3 Figure 7.1 has been corrected.

Section 8.3 The data-dependent multiplier rounding
freeze has been eliminated.

Section 8.4 Programming restrictions for dual-instruction
mode are added.
82495XP CACHE CONTROLLER/
82490XP CACHE RAM


■ Two-Way, Set Associative, Secondary 
Cache for i860™ XP Microprocessor 

■ 50 MHz “No Glue” Interface with CPU 

■ Configurable 

— Cache Size 256 or 512 Kbytes 
— Line Width 32, 64 or 128 Bytes 
— Memory Bus Width 64 or 128 Bits 

■ Dual-Ported Structure Permits 
Simultaneous Operations on CPU and 
Memory Buses 

■ Efficient MRU Way Prediction 
— Zero Wait States on MRU Hit 
— One Wait State on MRU Miss 

■ Dynamically Selectable Update Policies 
— Write-Through 

— Write-Once 
— Write-Back 


■ MESI Cache Consistency Protocol 

■ Hardware Cache Snooping 

■ Maintains Consistency with Primary 
Cache via Inclusion Principle 

■ Flexible User-Implemented Memory 
Interface Enables Wide Range of 
Product Differentiation 

— Clocked or Strobed 
— Synchronous or Asynchronous 
— Pipelining 
— Memory Bus Protocol 

■ 82495XP Cache Controller Available in 
208-Lead Ceramic Pin Grid Array 
Package 

■ 82490XP Cache RAM Available in 84- 
Lead Plastic Quad Flatpack Package 

(See Packaging Handbook, Order #240800) 


The Intel 82495XP cache controller and 82490XP cache RAM, when coupled with a user-implemented memo- 
ry bus controller, provide a second-level cache subsystem that eliminates the memory latency and bandwidth 
bottleneck for a wide range of multiprocessor systems based on the i860 XP microprocessor. The CPU 
interface is optimized to serve the i860 XP microprocessor with zero wait states at up to 50 MHz. A secondary 
cache built from the 82495XP and 82490XP isolates the CPU from the memory subsystem; the memory can 
run slower and follow a different protocol than the i860 XP microprocessor. 
Figure 0-1. Secondary Cache Configuration


Intel, intel, and i860 are trademarks of Intel Corporation. 

Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in an Intel product. No other circuit patent 
licenses are implied. Information contained herein supersedes previously published specifications on these devices from Intel. June 1991 
© INTEL CORPORATION, 1991 Order Number: 240956-001
Figure 1-3. 82490XP Pinout (Top View)
Figure 1-4. 82490XP Pinout (Bottom View)
1.1 Pin Cross Reference Tables


Table 1-1. 82495XP Pin Cross Reference by Name 


Signal              Location

ADS#                B15
AHOLD               A17
BGT#                M03
BLAST#              C15
BLE#                C16
BOFF#[CLEN0]        G15
BRDY#               P01
BRDYC1#             D15
BRDYC2#             F14
BUS#                P16
CACHE#              G14
CADS#               E03
CAHOLD              [illegible]
CD/C#               [illegible]
CDTS#               [illegible]
CFA0                [illegible]
CFA1                B14
CFA2                D06
CFA3                B02
CFA4                A16
CFA5                E14
CFA6                D14
CLK                 D11
CM/IO#              D04
CNA#[CFG0]          L04
CRDY#[SLFTST#]      M02
CWAY                J03
CW/R#               E04
D/C#                [illegible]
DRCTM#              M01
EADS#               J15
FLUSH#[NCPFLD#]     N04
FPFLD#[FPFLDEN]     J04
FSIOUT#             [illegible]
HITM#[CPUTYP]       D17
INV[CLEN1]          K15
KEN#                [illegible]
KLOCK#              C03
KWEND#[CFG2]        M04
LEN                 [illegible]
LOCK#               B16
MALE[WWOR#]         Q02
MAOE#               [illegible]
MAWEA#              Q17
MBALE[HIGHZ#]       P04
MBAOE#              P06
MCACHE#             [illegible]
MCFA0               Q16
MCFA1               [illegible]
MCFA2               R04
MCFA3               Q06
MCFA4               [illegible]
MCFA5               P14
MCFA6               P13
MCYC#               P17
MHITM#              H04
M/IO#               F16
MKEN#               R01
MRO#                J01
MSET0               Q15
MSET1               P12
MSET10              Q11
MSET2               P11
MSET3               [illegible]
MSET4               [illegible]
MSET5               [illegible]
MSET6               R17
MSET7               S17
MSET8               [illegible]
MSET9               [illegible]
MTAG0               Q10
MTAG1               P09
MTAG10              Q07
MTAG11              [illegible]
MTAG2               Q09
MTAG3               [illegible]
MTAG4               Q08
MTAG5               R15
MTAG6               S14
MTAG7               S17
MTAG8               P08
MTAG9               S16
MTHIT#              G03
MWB/WT#             [illegible]
NA#                 J17
NENE#               D05
PALLC#              [illegible]
PCD                 [illegible]
PWT                 C17
RDYSRC              C01
[illegible]         Q05
SET0                D13
SET1                [illegible]
SET10               A09
SET2                C14
SET3                B12
SET4                [illegible]
SET5                C11
SET6                D12
SET7                D09
SET8                D10
SET9                B09
SMLN#               C06
SNPADS#             F03
SNPBSY#             F01
SNPCLK[SNPMD]       S03
SNPCYC#             H03
SNPINV              P05
SNPNCA              Q03
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Table 1-1. 82495XP Pin Cross Reference by Name (Continued) 


Signal              Location

SNPSTB#             R03
SWEND#[CFG1]        Q01
SYNC#[MEMLDRV]      Q04
TAG0                COS
TAG1                A04
TAG10               B01
TAG11               C05
TAG2                008
TAG3                A03
TAG4                B04
TAG5                B03
TAG6                C07
TAG7                A02
TAG8                007
TAG9                A01
TCK                 P03
TDI                 N03
TDO                 C04
TMS                 P02
WAY                 L15
WBA                 M14
WBTYP               N15
WBWE#               M15
WB/WT#[WRMRST]      K14
W/R#                B17
WRARR#              L14

NC                  A14, A15, S01, S02

Vcc                 A05-A08, A10-A13, E01, E17, H01, H17, K01, K17,
                    L01, L17, C09, N17, F17, G01, G17, M17, N01,
                    S05-S13

Vss                 B05-B08, B10-B11, B13, E02, E16, F02, H02, H16,
                    J02, J16, K02, K04, K16, L02-L03, L16, C10, N16,
                    G02, G16, R02, R05-R10, M16, N02, R11-R13


Table 1-2. 82490XP Pin Cross Reference by Name 


Signal              Location

A0                  65
A1                  66
A10                 77
A11                 78
A12                 79
A13                 80
A14                 81
A15                 82
A2                  67
A3                  68
A4                  69
A5                  70
A6                  71
A7                  73
A8                  75
A9                  76
ADS#                63
BE#                 64
BLAST#              59
BOFF#               36
BRDY#               60
BRDYC#              61
BUS#                40
CDATA0              48
CDATA1              54
CDATA2              49
CDATA3              55
CDATA4              46
CDATA5              51
CDATA6              52
CDATA7              57
CLK                 30
CRDY#               43
HITM#               62
MAWEA#              41
MBRDY#[MISTB]       22
MCLK[MSTBM#]        26
MCYC#               42
MDATA0              18
MDATA1              14
MDATA2              10
MDATA3              6
MDATA4              16
MDATA5              12
MDATA6              8
MDATA7              4
MDOE#               20
MEOC#               23
MFRZ#[MEMLDRV]      24
MOCLK[MOSTB]        27
MSEL#[MTR4#/...]    25
MZBT#[MX4#/...]     21
PAR#                32
RESET               28
TCK                 3
TDI                 2
TDO                 84
TMS                 1
WAY                 45
WBA                 38
WBTYP               37
WBWE#               39
W/R#                58
WRARR#              44

NC                  83

Vcc                 5, 9, 13, 17, 29, 35, 50, 56, 74

Vss                 7, 11, 15, 19, 31, 33, 34, 47, 53, 72
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1.2 Quick Pin Reference 


BGT# [C490LDRV] 

I 

Bus Guaranteed Transfer, [82490XP Low Drive] 

This signal is generated by the MBC and supplied to the 82495XP. It indicates 
to the 82495XP a commitment by the MBC to complete the cycle on the memory 
bus. Until BGT# is activated, the 82495XP owns the cycle and will abort it if 
an intervening snoop occurs. After BGT#, the cycle is owned by the MBC until 
its completion. From BGT# until SWEND#, snoops will be accepted, but none 
will be processed until SWEND# is activated. 

During RESET’s falling edge, this signal controls the drive strength of the 
82495XP-to-82490XP interface signals. This strength is a function of the 
cache size, and therefore of the number of 82490XPs. Refer to the layout 
specifications section for more details. 

BLE# 

O 

BE Latch Enable 

The BLE# signal is used to control the enable line of an external ’377-type 
latch. The latch captures the i860 XP CPU’s BE (Byte Enable) signals and 
other CPU provided cycle attributes which do not go through the 82495XP. 

BRDY# 

I 

82495XP Burst Ready 

This is the burst ready indication from the memory bus controller. The MBC 
should connect its burst ready indication to the CPU BRDY#, the 82495XP 
BRDY# and the 82490XP BRDY#. In the CPU, it provides the same function 
as that described in the CPU data sheet. The 82495XP uses this indication 
only for burst tracking purposes. In the 82490XP, it increments the CPU 
latch burst counter. 

CADS# 

O 

Cache Address Strobe 

This signal is generated by the 82495XP and used by the memory bus 
controller. Its assertion requests execution of a memory bus cycle by the 
memory bus controller. When active, this signal indicates that the cache cycle 
control and attribute signals are valid. 

CAHOLD 

O 

82495XP AHOLD 

This signal is generated by the 82495XP to track the CPU AHOLD signal 
when used for warm-reset and LOCKed sequences. It also provides 
information about CPU and cache BIST. 

CD/C# 

O 

Cache Data/Control 

This is a cycle definition signal driven by the 82495XP. It indicates the type of 
memory bus cycle requested. This signal is valid with CADS# and can be 
pipelined by the memory bus controller. 

CDTS# 

O 

Cache Data Strobe 

This signal is driven by the 82495XP to the memory bus controller. For read 
cycles, CDTS# indicates that in the next CLK the memory bus controller can 
generate the first BRDY# for the read cycle. For write cycles, it indicates 
when data is available on the memory bus. Use of this signal allows 
complete independence between the address strobes (CADS#, SNPADS#) and 
the data strobe. 

CFG0-2 

I 

Cache Configuration bits 0-2 

These signals are inputs to the 82495XP. CFG0-2 allow the 82495XP to be 
configured to 5 different modes. The modes differ in 82495XP/CPU line 
ratio, tag size (4K/8K), and lines per sector. 
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1.2 Quick Pin Reference (Continued) 


CLK 

I 

Clock 

This signal provides the fundamental timing for the 82495XP, 82490XP and 
CPU. It must be provided to the 82495XP, 82490XPs, CPU and memory bus 
controller components with minimal skew. 

CM/IO# 

O 

Cache Memory/IO 

This signal is driven by the 82495XP and is a cycle definition signal. It 
indicates the type of memory bus cycle requested. This signal is valid with 
CADS# and can be pipelined by the memory bus controller. 

CNA#[CFG0] 

I 

82495XP Next Address Enable, [Configuration Pin 0] 

This signal is driven by the memory bus controller and supplied to the 
82495XP. It is used by the memory bus controller to dynamically pipeline 
CADS# cycles. 

During RESET’s falling edge it functions as the 82495XP CFG0 input. 

CRDY#[SLFTST#] 

I 

Cache Memory Bus Ready, [82495XP Self Test] 

This signal is generated by the memory bus controller and informs the 
82495XP and 82490XP that a memory bus cycle has been completed. 
CRDY# activation ends the memory bus cycle. 

During RESET’s falling edge, if this signal is sampled low (active) and 
MBALE is sampled high (active), the 82495XP self test is invoked. 

CWAY 

O 

Cache Way 

CWAY is driven by the 82495XP and is a cycle definition signal that 
indicates to the memory bus controller the WAY to be used by the 
requested cycle. On line-fills it indicates the way the line will be loaded. For 
write-backs it indicates the WAY that was written-back. This signal is valid 
with CADS#. 

CW/R# 

O 

Cache Write/Read 

This signal is driven by the 82495XP and is an 82495XP cycle definition 
signal. It indicates the type of memory bus cycle requested. This signal is 
valid with CADS# and can be pipelined by the memory bus controller. 

DRCTM# 

I 

Memory Bus Direct to [M] State 

This signal is an input to the 82495XP. It is the mechanism by which the 
memory bus can dynamically inform the 82495XP of a request to skip the 
[E] state and move the line directly to the [M] state. This signal is sampled 
by the 82495XP when SWEND# is asserted. 

FLUSH# [NCPFLD#] 

I 

Flush the 82495XP cache, [Enable Non-Cacheable PFLD] 

This signal is an input to the 82495XP. When active, FLUSH# causes the 
82495XP to write back all of its modified lines into main memory and then 
invalidate all tag locations. At the end of a flush operation the 82495XP tag 
array will be completely invalidated. 

During RESET activation, this pin functions as the NCPFLD# configuration 
signal which, with FPFLDEN, selects one of three modes for handling 
i860 XP CPU floating point load cycles. 
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1.2 Quick Pin Reference (Continued) 


FPFLD#[FPFLDEN] 

I/O 

FIFO PFLD Enable [PFLD Mode Select] 

During RESET, the FPFLDEN and NCPFLD# inputs select one of three 
modes to handle i860 XP CPU pipelined floating point load cycles. In the 
mode which supports an external FIFO, the FPFLD# output indicates a 
PFLD cycle to be loaded into the FIFO. 

FSIOUT# 

O 

Flush/Sync/Initialization Output 

This signal is an output of the 82495XP and indicates the start and end of 
three operations: Flush, Sync, and Initialization. The output is activated 
when the operation internally begins and is de-activated when the 
operation ends. 

KLOCK# 

O 

82495XP LOCK# 

This signal is driven by the 82495XP and indicates to the memory bus 
controller a request to execute atomic read-modify-write sequences. 
KLOCK# is active with the CADS# of the first LOCKed operation and 
remains active until at least the clock following CADS# of the last cycle of 
the LOCKed operation. 

KWEND#[CFG2] 

I 

Cacheability Window End, [Configuration Pin 2] 

This signal is generated by the MBC and indicates to the 82495XP that the 
Cacheability Window has expired. At this point the 82495XP will latch the 
memory cacheability signal (MKEN#) and make decisions based on the 
cacheability attribute. MRO#, which indicates the Read-Only cycle attribute, 
is also sampled at this point. 

During RESET’s falling edge this line functions as the CFG2 configuration 
signal, which is used to configure the 82495XP/82490XP with cache 
parameters. 

MALE[WWOR#] 

I 

Memory Bus, Address Latch Enable [Weak Write Ordering] 

This signal is generated by the memory bus controller and controls an 
82495XP internal transparent address latch (’373-like). CADS# generates 
a new address at the input of the internal address latch. MALE activation 
(high) allows this address to flow to the memory bus, provided MAOE# is 
active. When MALE is inactive (low), the address at the latch input is 
latched. 

During RESET’s falling edge, WWOR# configures the 82495XP into strong 
or weak write-ordering mode. 

MAOE# 

I 

Memory Bus Address Output Enable 

This signal is generated by the memory bus controller and controls the 
82495XP’s output buffer of the memory bus address latches. The 82495XP 
drives the memory bus address lines if MAOE# is active (low). Otherwise, 
they are tristated. MAOE# also serves as a qualifier for snooping cycles: 
snoops are enabled when it is inactive. 

MBALE[HIGHZ#] 

I 

Memory Bus, 82495XP sub-line-address Latch Enable [High Impedance 
Output] 

This signal has the same function as MALE but controls only the 82495XP 
sub-line addresses. It is generated by the memory bus controller and 
controls an 82495XP internal transparent address latch (’373-like). 
CADS# generates a new address at the input of the internal address 
latch. MBALE activation (high) allows the sub-line address to flow 
to the memory bus, provided MBAOE# is active. When MBALE is inactive 
(low), the sub-line address at the latch input is latched. 

HIGHZ#, if active along with SLFTST#, causes the 82495XP to float all of 
its outputs. 
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1.2 Quick Pin Reference (Continued) 


MBAOE# 

I 

Memory Bus, 82495XP sub-line Address Output Enable 

This signal has a function similar to MAOE#, but controls only the 
82495XP sub-line addresses. 

If MBAOE# is active (low), the 82495XP drives the sub-line portion of the 
address onto the memory bus. Otherwise, it is tristated. MBAOE# is also 
sampled during snoop cycles. If MBAOE# is sampled inactive with 
SNPSTB#, the snoop write-back cycle (if any) will begin at the sub-line 
address provided. If MBAOE# is active with SNPSTB#, the snoop write-back 
will begin at sub-line address 0. 
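
The snoop write-back rule above can be sketched as a small function. This is an illustrative model only, assuming the pin level is passed as a boolean (True = electrically high, i.e. MBAOE# inactive); the function name and argument names are hypothetical and do not come from the data sheet.

```python
def snoop_writeback_start(mbaoe_n_high: bool, subline_address: int) -> int:
    """Return the sub-line address at which a snoop write-back begins.

    mbaoe_n_high    -- level of MBAOE# sampled with SNPSTB# (True = high = inactive)
    subline_address -- sub-line address provided with the snoop
    """
    if mbaoe_n_high:
        # MBAOE# inactive: start at the sub-line address provided.
        return subline_address
    # MBAOE# active (low): start at sub-line address 0.
    return 0
```

For example, with MBAOE# sampled low the function returns 0 regardless of the address supplied with the snoop.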

MBRDY#(MISTB) 

I 

Memory Bus Ready, (Memory Input Strobe) 

This pin is an input to the 82490XP. It is used in clocked bus mode to 
indicate the end of a transfer. When active (low) it indicates that the 
82490XP should increment the burst counter and either output the next 
data or get ready to accept the next data. 

In strobed memory bus mode this pin is the input data strobe to the 
82490XP. On each MISTB edge, the 82490XP latches the data and 
increments the burst counter. 

MCACHE# 

O 

82495XP Internal Cacheability 

This signal is driven by the 82495XP. On read cycles, this signal indicates 
the cycle’s internal cacheability attribute. In write cycles MCACHE# is only 
active for write-back cycles. MCACHE# is not activated for I/O, special 
cycles and Locked Cycles. 

MCFA6-MCFA0 

I/O 

Memory Bus Configurable address lines 

MSET10-MSET0 

I/O 

Memory bus SET number 

MTAG11-MTAG0 

I/O 

Memory bus TAG bits 

These are the memory bus address lines of the 82495XP and should be 
connected to the A31-A2 (A31-A3 for a 64-bit bus) signals of the Memory 
Bus. These signals, along with the byte enables, define the physical area of 
memory or I/O accessed. 

The 82495XP drives these signals in normal memory bus cycles and uses 
them as inputs during snooping. 

MCLK[MSTBM#] 

I 

Memory Bus Clock, [Memory Input Strobe] 

In clocked memory bus mode this pin provides the memory bus clock to the 
82490XP. In clocked mode, memory bus signals and memory bus data are 
sampled on the rising edge of MCLK. In a clocked memory bus write, 
data is driven off of MCLK or MOCLK depending upon the configuration. 

This pin is an input to the 82490XP. It is sampled during reset and 
determines the memory bus type. If active (low), the memory bus will be 
strobed. If inactive (high), the memory bus will be clocked. 

If a clock is detected at this input, this pin becomes the memory bus clock, 
and clocked memory bus mode is selected. 

MDATA0-MDATA7 

I/O 

Memory Bus Data 

These pins are the 8 memory data pins of the 82490XP. All or part of these 
pins will be used depending on the cache configuration. In clocked memory 
bus mode, these pins are sampled with the rising edge of MCLK. New data 
is driven out on these pins with MEOC# or the rising edge of MCLK or 
MOCLK together with MBRDY# active. In strobed memory bus mode, 
these pins are sampled on each MISTB edge. New data is driven out on 
these pins with each MOSTB edge. 
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1.2 Quick Pin Reference (Continued) 


MDOE# 

I 

Memory Data Output Enable 

This signal is an input to the 82490XP. The memory bus output enable is 
used to control the 82490XP’s driving of data onto the memory bus. When 
this pin is inactive (high), the MDATA[0:7] pins are tristated. When this pin is 
active (low), the MDATA[0:7] pins are actively driving data. The function of 
this pin is the same for strobed or clocked memory bus operation, as 
MDOE# has no relation to CLK or MCLK. 

MEOC# 

I 

Memory End of Cycle 

This signal is an input to the 82490XP. Since it is synchronous to the 
memory bus, it may be used to end a cycle on the memory bus and begin a 
pending cycle without waiting for synchronization to the CPU CLK. MEOC# 
also causes the latching or driving of data and the resetting of the memory 
burst counter. 

MFRZ#[MEMLDRV] 

I 

Memory Freeze, [Memory Bus Low Drive] 

This signal is an input to the 82490XP. It is used for write cycles that could 
cause allocation cycles. When this pin is active (low), write data is latched in 
the 82490XP. The subsequent allocation will not overwrite data latched by 
the write. This prevents the actual write to memory from having to be 
performed on the memory bus. The allocated line will be placed in the [M] 
state in the cache since memory has not been updated. 

During RESET’s falling edge, this signal is sampled to indicate the 
82490XP’s memory bus driving strength. The 82490XP provides normal and 
high drive capability buffers. 

MHITM# 

O 

Memory Bus Hit to Modified Line 

This signal is driven by the 82495XP during snoop cycles and indicates 
whether the snooping address hit a Modified line in the 82495XP cache. The 
82495XP automatically schedules the writing-back of modified lines when 
snoop hits occur. MHITM# is activated the CLK after SNPCYC# and will 
remain active until the next SNPSTB#. 

MKEN# 

I 

Memory Bus Cacheability 

This signal is an input to the 82495XP. It is the memory bus cache enable 
pin. It is used to indicate to the 82495XP if the current memory bus cycle is 
cacheable or not. This pin is sampled by the 82495XP with KWEND# 
assertion. 

MOCLK(MOSTB) 

I 

Memory Output Clock, (Memory Output Strobe) 

MOCLK controls a transparent latch at the 82490XP data outputs. By 
providing a clock input, skewed from MCLK, MDATA hold time may be 
increased. 

In strobed bus mode this pin is the data output strobe. On each MOSTB 
edge, new data will be output onto the memory bus. 

MRO# 

I 

Memory Bus Read-Only 

This pin is an input to the 82495XP. It is the READ-ONLY attribute pin. It is 
used to indicate to the 82495XP that the accessed line should get a READ- 
ONLY attribute. READ-ONLY lines will be non-cacheable in the first level 
cache. READ-ONLY lines will be cached in the 82495XP if MKEN# is 
sampled active during KWEND# and will be cached in the [S] state. This pin 
is sampled by the 82495XP with KWEND# assertion. 
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1.2 Quick Pin Reference (Continued) 


MSEL#[MTR4/TR8#] 

I 

Memory Select, [Memory Transfer] 

This signal is a chip select input to the 82490XP. MSEL# activation 
qualifies the MBRDY# input of the 82490XP. MSEL# going active causes 
the sampling of MZBT# for the next cycle. MSEL# going inactive resets 
the 82490XP’s internal memory burst counter. 

During RESET’s falling edge, this pin determines the number of transfers 
necessary on the memory bus for each cache line. If high, there are 4 
transfers on the memory bus for each cache line. If low, there are 8 
transfers on the memory bus for each cache line. 

MTHIT# 

O 

Memory Bus Tag Hit 

This signal is driven by the 82495XP during snoop cycles. It indicates 
whether the snooping address hit any line (exclusive, shared, or modified) 
in the 82495XP cache. MTHIT# is activated the CLK after SNPCYC# and 
will remain active until the next SNPSTB#. 

MWB/WT# 

I 

Memory Bus Write Policy 

This signal is an input to the 82495XP. It is the mechanism by which the 
memory bus can dynamically inform the 82495XP of the cycle write policy 
(Write-Through/Write-Back). This signal is sampled by the 82495XP with 
SWEND# activation. 

MZBT#[MX4/MX8#] 

I 

Memory Zero Based Transfer, [Memory I/O Bits] 

This signal is an input to the 82490XP. When this pin is sampled active 
(with MSEL# or MEOC#) it indicates that the memory bus cycle should 
start with burst location zero independent of the sub-line address 
requested by the CPU. 

During RESET’s falling edge, this pin determines the number of I/O pins 
used for the memory bus. When HIGH it indicates that 4 I/O pins are used 
per 82490XP. When LOW it indicates that 8 I/O pins are used. 

NENE# 

O 

Next Near 

This signal is generated by the 82495XP and indicates to the memory bus 
controller if the address of the requested memory cycle is “near” the 
address of the previously generated one (in the same 2K DRAM page). 
This information can be used by the memory bus controller to optimize 
access to paged or static column DRAMs. This signal is valid together with 
CADS#. 
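
The “near” test can be illustrated with a short sketch. It assumes that “2K DRAM page” means a 2048-byte aligned region of the byte address space; the data sheet does not state the unit here, so both that interpretation and the names `PAGE_SIZE` and `is_near` are assumptions made for illustration.

```python
PAGE_SIZE = 2 * 1024  # assumed: a 2K page is 2048 bytes of byte address space


def is_near(previous_address: int, address: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True when both addresses fall in the same aligned page.

    This models only the comparison that NENE# reports; it says nothing
    about signal timing or polarity.
    """
    return previous_address // page_size == address // page_size
```

A memory bus controller model could use such a predicate to decide whether a DRAM page-mode access is possible for the new cycle.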

PALLC# 

O 

Potential Allocate 

This signal is generated by the 82495XP and indicates to the memory bus 
controller that the current write cycle can potentially allocate a cache line. 
Potential allocate cycles are cycles which are 82495XP misses with PCD, 
PWT inactive. 

RDYSRC 

O 

Ready Source 

This signal is an output of the 82495XP. It indicates the source of the 
BRDY generation for the CPU. When high it indicates that the memory bus 
controller should generate BRDYs to the CPU; when low it indicates that 
the 82495XP will be the one providing BRDYs. 
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1.2 Quick Pin Reference (Continued) 


RESET 

I 

Reset 

This signal forces the 82495XP and 82490XP to begin execution at a known state. Its 
falling edge samples the state of the configuration pins. RESET is an asynchronous 
input to the 82495XP and 82490XP. 

The following 82495XP pins are sampled during reset falling edge: 

CNA#[CFG0]: CFG0 line of 82495XP configuration inputs. 

SWEND#[CFG1]: CFG1 line of 82495XP configuration inputs. 

KWEND#[CFG2]: CFG2 line of 82495XP configuration inputs. 

FLUSH#[NCPFLD#]: Enables decoding of the non-cacheable PFLD mode. Active if low. 

FPFLD#[FPFLDEN]: Enables the external FIFO PFLD mode. Active high. 

BGT#[C490LDRV]: Indicates the driving strength of the 82495XP/82490XP interface. If 
high, the 82495XP can drive up to 10 82490XPs without derating. If low, the 82495XP 
can drive up to 18 82490XPs without derating. 

SYNC#[MEMLDRV]: Indicates the 82495XP’s memory bus driving strength. 

SNPCLK[SNPMD]: Indicates the snoop mode, synchronous or asynchronous. 

CFG0-CFG2 signals are used to configure the 82495XP/82490XP with cache 
parameters. They define the lines/sector, line ratio, and number of tags. 

MALE[WWOR#]: Enforces strong or weak write-ordering consistency. 

MBALE[HIGHZ#]: If active along with SLFTST#, tristates all 82495XP outputs. 

The following 82490XP pins are sampled during reset falling edge: 

PAR#: If active (low), this pin configures the 82490XP as a parity storage device. The 
parity configuration stores the parity bits belonging to data stored in other 82490XPs. 

MZBT#[MX4/MX8#]: Determines the number of I/O pins used for the memory bus 
interface. If high, four I/O pins are chosen. If low, eight I/O pins are chosen. 

MSEL#[MTR4/TR8#]: Determines the number of transfers necessary on the memory bus 
for each cache line. If high, four memory bus transfers are needed to fill a cache line. If 
low, eight memory bus transfers are needed to fill a cache line. 

MCLK[MSTBM#]: If active (low), this pin indicates a strobed memory bus configuration. If 
inactive (high), a clocked memory bus is chosen. 

MFRZ#[MEMLDRV]: Indicates the 82490XP’s memory bus driving strength. 
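
As an illustration of the 82490XP strapping options listed above, the following sketch decodes a set of sampled pin levels into the configuration they select. It is a model for documentation purposes only: the class and function names are hypothetical, each field is the electrical level at RESET's falling edge (True = high), and only the options whose encoding is stated in the text are decoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Straps82490XP:
    """Pin levels sampled by an 82490XP at RESET's falling edge (True = high)."""
    par_n: bool        # PAR#
    mx4_mx8_n: bool    # MZBT#[MX4/MX8#]
    mtr4_tr8_n: bool   # MSEL#[MTR4/TR8#]
    mstbm_n: bool      # MCLK[MSTBM#]


def decode_82490xp(straps: Straps82490XP) -> dict:
    """Translate strap levels into the options described in the pin list."""
    return {
        # PAR# low configures the device as a parity storage device.
        "parity_device": not straps.par_n,
        # High selects four memory bus I/O pins, low selects eight.
        "memory_io_pins": 4 if straps.mx4_mx8_n else 8,
        # High selects four transfers per cache line, low selects eight.
        "transfers_per_line": 4 if straps.mtr4_tr8_n else 8,
        # MSTBM# low selects a strobed memory bus, high a clocked one.
        "memory_bus": "clocked" if straps.mstbm_n else "strobed",
    }


def max_82490xp_without_derating(c490ldrv_high: bool) -> int:
    """BGT#[C490LDRV] on the 82495XP: high -> up to 10 devices, low -> up to 18."""
    return 10 if c490ldrv_high else 18
```

For instance, `decode_82490xp(Straps82490XP(True, False, True, True))` describes a data (non-parity) device using eight I/O pins, four transfers per line, and a clocked memory bus.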

SMLN# 

O 

Same Cache Line 

This signal is an output of the 82495XP. It is used to indicate to the memory bus controller 
that the current cycle is to the same 82495XP line as the previous one. This indication 
can be used by the memory bus controller to selectively activate its SNPSTB# signal to 
other caches. For example, back to back snoop hits to the same line may be snooped 
only once. This signal is valid together with CADS#. 
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1.2 Quick Pin Reference (Continued) 


SNPADS# 

O 

Cache Snoop Address Strobe 

This signal is an output of the 82495XP. It has the same functionality as 
CADS#, but is generated only on snoop write-back cycles. Because snoop 
write-back cycles are the only ones generated independently of CPU bus 
activity, this separate address strobe should ease implementation of 
the memory bus controller. Whenever SNPADS# is active, the memory bus 
controller should abort all pending cycles (cycles for which BGT# has not 
yet been issued; after BGT#, the memory bus controller is responsible for 
completing the cycle). The 82495XP assumes that non-committed cycles are 
aborted upon SNPADS# and may re-issue them after the completion of the 
snoop. 

SNPBSY# 

O 

Snoop Busy 

This signal is driven by the 82495XP. When inactive (high), it indicates that the 
82495XP is ready to accept another snoop cycle. SNPBSY# is activated 
for one of two reasons: a snoop hits a modified line, or a back-invalidation is 
needed while one is already in progress. In either of these cases, the 
82495XP will not perform the look-up for a pending snoop until SNPBSY# is 
de-activated. 

SNPCLK[SNPMD] 

I 

Snoop Clock [Snoop Mode] 

This pin provides the 82495XP with the snoop clock to be used in clocked 
memory interfaces. In clocked mode, SNPSTB#, SNPINV, SNPNCA, 
MBAOE#, MAOE#, and the address lines are sampled by SNPCLK. 

During RESET activation, this pin functions as the SNPMD (snoop mode) 
signal. If high it indicates strobed snooping mode. If low it indicates 
synchronous snooping mode. For clocked snooping mode, SNPCLK is 
connected to the snoop clock source. 

SNPCYC# 

O 

Snoop Cycle 

This signal is an output of the 82495XP. It indicates when the snooping look-up 
is actually taking place in the 82495XP tag RAM. 

SNPINV 

I 

Snoop Invalidation 

This signal is an input to the 82495XP and indicates the resulting line state in 
case of a snoop hit cycle. If active, it forces the line to go to an invalid state. 
This signal is sampled with SNPSTB#. 

SNPNCA 

I 

Snoop Non Caching Device Access 

This signal is an input to the 82495XP and provides the 82495XP information 
on whether the current memory bus master is a non caching device (DMA, 
etc). This indication allows the 82495XP to avoid changing line states from 
exclusive to shared unnecessarily. 

SNPSTB# 

I 

Snoop Strobe 

This signal is an input to the 82495XP which is used to initiate a snoop. 
SNPSTB# causes the latching of the snoop address and parameters. The 
82495XP supports three latching modes: Clocked, Strobed, and Synchronous. 
In the clocked mode, address and attribute signals are latched by SNPCLK 
when SNPSTB# is active. In the strobed mode, address and attributes 
are latched by the SNPSTB# falling edge. In synchronous mode, address 
and attribute signals are latched by CLK when SNPSTB# is active. 
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1.2 Quick Pin Reference (Continued) 


SWEND# [CFG1] 

I 

Snoop Window End, [Configuration Pin 1] 

This signal is generated by the MBC and indicates to the 82495XP that the 
Snoop Window has expired. At this point the 82495XP will latch the memory 
bus attributes: write policy (MWB/WT#) and direct to [M] transfer 
(DRCTM#). At the end of the snooping window, all other devices have 
snooped the bus master’s address and have generated address caching 
attributes on the bus. Once a cycle begins, the 82495XP prevents snooping 
until it has received SWEND#. The 82495XP will act based on those 
attributes and will update its tag RAM. 

During RESET’s falling edge this line functions as the CFG1 configuration 
signal, which is used to configure the 82495XP/82490XP with cache 
parameters. 

SYNC#[MEMLDRV] 

I 

Synchronize 82495XP cache, [Memory Bus Low Drive] 

This signal is an input to the 82495XP. Activation of this line causes the 
synchronization of the 82495XP tag array with main memory. All 82495XP 
modified lines will be written back to main memory. The difference between 
FLUSH and SYNC is that on SYNC the 82495XP and CPU tag arrays will NOT 
be invalidated. All the valid entries will be kept, with all modified lines 
(M state) becoming non-modified (E state). 

During RESET’s falling edge, this signal is sampled to indicate the memory 
bus driving strength. If it is sampled low, the maximum capacitive load 
without derating is 100 pF. If it is sampled high, the maximum capacitive load 
without derating is 50 pF. 
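
The difference between FLUSH and SYNC described in this entry and in the FLUSH# entry can be sketched on a toy tag array. This is a behavioral illustration only, assuming a tag array modeled as a dictionary from line address to MESI state; the names `State`, `flush`, and `sync` are hypothetical, and write-back ordering and timing are not modeled.

```python
from enum import Enum


class State(Enum):
    M = "modified"
    E = "exclusive"
    S = "shared"
    I = "invalid"


def flush(tags: dict) -> tuple:
    """FLUSH: write back every modified line, then invalidate all tags."""
    written_back = [addr for addr, state in tags.items() if state is State.M]
    new_tags = {addr: State.I for addr in tags}
    return new_tags, written_back


def sync(tags: dict) -> tuple:
    """SYNC: write back every modified line; M lines become E, others are kept."""
    written_back = [addr for addr, state in tags.items() if state is State.M]
    new_tags = {
        addr: (State.E if state is State.M else state)
        for addr, state in tags.items()
    }
    return new_tags, written_back
```

Both operations write back the same set of lines; they differ only in the state left behind in the tag array.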

TCK 

I 

Testability Clock 

This signal is an input to both the 82495XP and 82490XP. This is the 
boundary scan clock. This signal has to be connected to a clock 
synchronous to CLK to ensure initialization of the test logic. 

TDI 

I 

Testability serial input 

This signal is an input to both the 82495XP and 82490XP. 

TDO 

O 

Testability serial output 

This signal is an output of both the 82495XP and 82490XP. 

TMS 

I 

Testability Control 

This signal is an input to both the 82495XP and 82490XP. 


The following pins have internal pull-ups: 

ADS#, NA#, FPFLD#, TDI, TMS, BGT#, 
KWEND#, SWEND#, CNA#, BRDY#, SYNC#, 
FLUSH#, SNPSTB#, MRO#, DRCTM#, TCK, 
SNPCLK, MFRZ#, MZBT#, MCLK, MOCLK. 


During tri-state output testing sequence, all pull-ups 
will be disabled. 

The following signals are glitch free. These signals 
are always at a valid logic level following RESET: 

CADS#, CDTS#, SNPADS#, SNPCYC#. 
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1.3 Output Pins 

Table 1-3 lists all output pins, from which part(s) they are driven, and their active levels. 


Table 1-3. Output Pins 


Name                   Part               Active Level

BLE#                   82495XP            LOW
CADS#                  82495XP            LOW
CAHOLD                 82495XP            HIGH
CDTS#                  82495XP            LOW
CWAY                   82495XP            -
CW/R#, CD/C#, CM/IO#   82495XP            -
FSIOUT#                82495XP            LOW
KLOCK#                 82495XP            LOW
MCACHE#                82495XP            LOW
MHITM#                 82495XP            LOW
MTHIT#                 82495XP            LOW
NENE#                  82495XP            LOW
PALLC#                 82495XP            LOW
RDYSRC                 82495XP            HIGH
SMLN#                  82495XP            LOW
SNPADS#                82495XP            LOW
SNPBSY#                82495XP            LOW
SNPCYC#                82495XP            LOW
TDO                    82495XP/82490XP    -





1.4 Input Pins 

Table 1-4 lists all input pins, which part(s) they are input to, their active level, and whether they are 
synchronous or asynchronous inputs. 


Table 1-4. Input Pins 


Name                Part               Active Level   Synchronous/Asynchronous

BGT#[C490LDRV]      82495XP            LOW            Synchronous to CLK
BRDY#               82495XP/82490XP    LOW            Synchronous to CLK
CLK                 82495XP/82490XP    -              -
CFG3                82495XP            -              Synchronous to CLK
CNA# (CFG0)         82495XP            LOW            Synchronous to CLK
CRDY#[SLFTST#]      82495XP/82490XP    LOW            Synchronous to CLK
DRCTM#              82495XP            LOW            Note 2
FLUSH#[NCPFLD#]     82495XP            LOW            Asynchronous
CPUTYP              82495XP            LOW            Synchronous to CLK
KWEND# (CFG2)       82495XP            LOW            Synchronous to CLK
MALE, MBALE         82495XP            HIGH           Asynchronous
MAOE#, MBAOE#       82495XP            LOW            Asynchronous
MCLK[MSTBM#]        82490XP            LOW            Synchronous to MCLK
MBRDY# (MISTB)      82490XP                           -
MDOE#               82490XP            LOW            Asynchronous
MEOC#               82490XP            LOW            Synchronous/Asynchronous, Note 1
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Table 1-4. Input Pins (Continued) 


Name                Part               Active Level   Synchronous/Asynchronous

MFRZ#               82490XP            LOW            Synchronous/Asynchronous, Note 1
MOCLK(MOSTB)        82490XP
MSEL#[MTR4/TR8#]    82490XP            LOW            Synchronous/Asynchronous, Note 1
MZBT#[MX4/MX8#]     82490XP            LOW
MKEN#               82495XP            LOW            Note 2
MRO#                82495XP            LOW            Note 2
MWB/WT#             82495XP            -              Note 2
PAR#                82490XP            LOW            Synchronous to CLK
RESET               82495XP/82490XP    HIGH           Asynchronous
SNPCLK[SNPMD]       82495XP            -              -
SNPINV              82495XP            HIGH           Note 3
SNPNCA              82495XP            HIGH           Note 3
SNPSTB#             82495XP            LOW            Note 3
SWEND# (CFG1)       82495XP            LOW            Synchronous to CLK
SYNC#[MEMLDRV]      82495XP            LOW            Asynchronous
TCK                 82495XP/82490XP    -
TDI                 82495XP/82490XP    -              Synchronous to TCK
TMS                 82495XP/82490XP    -              Synchronous to TCK


NOTES: 

(1) In Clocked memory bus mode these pins are synchronous with MCLK. In Strobed memory bus mode these pins are 
asynchronous. 

(2) MWB/WT#, DRCTM# must be synchronous to CLK during SWEND#. MKEN#, MRO# must be synchronous to CLK 
during KWEND#. 

(3) In clocked memory bus mode these pins are synchronous with SNPCLK. In strobed memory mode these pins are 
asynchronous. 

1.5 Input/Output Pins 

Table 1-5 lists all input/output pins, which part they interface with, and when they are floated. 


Table 1-5. Input/Output Pins 


Name                Part       Synch/Asynch          When Floated

FPFLD#[FPFLDEN]     82495XP    Synchronous to CLK    -
MCFA0-MCFA6         82495XP    Note 1                MAOE# = High
MDATA0-MDATA7       82490XP    Note 2                MDOE# = High and during Reset
MSET0-MSET10        82495XP    Note 1                MAOE# = High
MTAG0-MTAG11        82495XP    Note 1                MAOE# = High


NOTES: 

(1) With MALE high and MAOE# low, these pins are synchronous to CLK. 

(2) In Clocked memory bus mode these pins are synchronous with MCLK. In Strobed memory bus mode these pins are 
asynchronous. 
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1.6 Pin State During Reset 


Table 1-6. Pin State During Reset 


Pin Name                                    Pin State during Reset

CADS#, CDTS#, SNPADS#                       High
CW/R#, CD/C#, CM/IO#, MCACHE#               Undefined
RDYSRC, PALLC#, CWAY                        Undefined
NENE#, SMLN#                                Undefined
KLOCK#                                      High
FPFLD#                                      High
MSET0-MSET10, MTAG0-MTAG11, MCFA0-MCFA6     Note 1
CAHOLD                                      Note 2
MHITM#, MTHIT#                              High
SNPCYC#, SNPBSY#                            High
TDO                                         Note 3


NOTES: 

(1) MSET, MTAG, and MCFA signals are high impedance during reset if MAOE# and MBAOE# are deasserted. 

(2) The state of CAHOLD depends on whether self-test is selected (see testability chapter for details). 

(3) The state of TDO is controlled by the boundary scan, which is independent of other signals including RESET (see
testability chapter for details).

2.0 CHIPSET INTRODUCTION

The 82495XP/82490XP is a second-level cache 
controller chipset for the i860 XP CPU. The chipset 
provides a unified code and data cache which is 
software transparent. The 82495XP/82490XP has 
been designed to support a high-speed CPU/cache 
core interface, and a same or lower speed memory 
bus interface. 

The 82495XP is the cache controller. It contains 8K
tags and control logic to control up to a 512K
cache. The 82490XP is a custom cache data RAM
designed to be used with the 82495XP. Between 8
and 16 82490XPs are required to create a 256K to
512K cache, respectively. The memory bus controller
(MBC) is the set of logic required to interface the
82495XP and 82490XP to the memory bus. The
MBC provides product differentiation, and its
implementation ultimately determines system
performance.


2.1 Main Features 

The 82495XP/82490XP have the following main 
features: 

— Tracks the speed of the i860 XP CPU

— Large cache size support:

    4K or 8K tags
    1 or 2 lines per sector
    4 or 8 transactions per line
    64- or 128-bit wide memory bus
    256K or 512K cache

— Write-back cache with full multiprocessing
consistency support:

    supports the MESI protocol
    watches the memory bus to guarantee 1st level, 2nd
    level cache consistency
    maintains inclusion

— Two-way set-associative with MRU hit prediction
algorithm

— Zero wait state hit cycles on MRU hits. One wait
state on MRU misses

— Concurrent CPU and memory bus transactions

— Supports synchronous, asynchronous, and
strobed memory bus architectures


2.2 CPU/Cache Core Description 

Figure 2-1 depicts a block diagram of the basic
cache subsystem. The cache subsystem provides a
gateway between the CPU and the memory bus. All
CPU accesses which can be serviced locally by the
cache subsystem are filtered out of the memory
bus traffic. Local cycles (CPU cycles which hit the
cache and do not require a memory bus cycle) are
therefore completely invisible to the memory bus,
providing the reduction in memory bus bandwidth
necessary for multiprocessing systems. Another
very important function of the 82495XP cache
subsystem is to provide speed decoupling between
the CPU and memory buses. Processors are quickly
achieving operating frequencies which can be
very difficult for the memory subsystem to meet. The
82495XP cache subsystem is optimized to serve the
CPU with zero wait states up to very high frequencies
(50 MHz), while providing the decoupling
necessary to run slower memory bus cycles.

The basic functions of the cache subsystem elements
are:

82495XP: Main control element. It includes the tags
and line states and provides hit or miss decisions. It
handles the CPU bus requests completely and coordinates
with the memory bus controller when an access
needs the memory bus. It controls the
82490XP data paths for both hits and misses to provide
the CPU with the correct data, and it dynamically adds
wait states based on the MRU prediction mechanism.
The 82495XP is also responsible for performing
memory bus snoop operations while other devices
are using the memory bus. The 82495XP drives
the cycle address and other attributes during a
memory bus access. A block diagram of the
82495XP is shown in Figure 2-2.


Figure 2-1. 82495XP Cache Subsystem


Figure 2-2. 82495XP Block Diagram

82490XP: Implements the cache SRAM storage and
data path. It includes latches, muxes, and logic which
allow it to work in lock-step with the 82495XP to
efficiently serve both hit and miss accesses. It takes
full advantage of internal silicon flexibility to provide
a degree of performance otherwise unachievable
with discrete implementations. It supports zero wait
state hit accesses and concurrent CPU and memory bus
accesses, and includes a replication of the MRU bits
for autonomous way prediction. During memory bus
cycles it acts as a gateway between the CPU and
memory buses. A block diagram of the 82490XP is shown
in Figure 2-3.


Memory Bus Controller: Server for memory bus cycles.
It adapts the CPU/cache core to a specific
memory bus protocol. It coordinates line fills,
flushes, write-backs, etc. with the 82495XP. The
memory bus controller's flexibility allows customers
to easily adapt the 82495XP cache subsystem to
their specific architectures, and to provide their own
differentiation. Figure 2-4 shows an example memory
bus controller. The MBC handles all cycle control,
data transferring, snooping, and any synchronization.


Figure 2-3. 82490XP Block Diagram


Figure 2-4. MBC Example Block Diagram

3.0 CACHE OVERVIEW

This chapter gives a brief description of 82495XP/ 
82490XP configurations, interface, snooping mecha- 
nism, cycle control mechanism, and memory bus 
control mechanism. Each section of this overview is 
described in more detail in later chapters. 


3.1 Configuration 

The 82495XP/82490XP cache chipset offers a num- 
ber of configuration options. The system designer 
can choose from a number of different operating 
characteristics, including memory bus modes, 
snooping modes, and internal physical attributes 
(line size, lines per sector, etc.). The flexibility of
these configuration options allows the 82495XP/
82490XP cache to be used in a wide range of
applications.

Configurations are selected by altering the 
82495XP/82490XP inputs during RESET. They are 
not dynamically changeable, and to conserve pins 
some configuration inputs become 82495XP or 
82490XP inputs/outputs after RESET. 

3.1.1 PHYSICAL CACHE 

Physically, the 82495XP/82490XP can be config- 
ured to support many different cache configurations. 
By selecting one cache configuration, other configu- 
rations may be excluded. The 82495XP/82490XP 
can be configured to support: 

— 256K or 512K cache

— 64- or 128-bit wide memory bus

— One or two lines per sector

— 1:1, 1:2, or 1:4 CPU to 82495XP line size ratio

— 4 or 8 memory bus transactions per line

— 4K or 8K tag size

— Strong or weak write ordering

Figure 3-1 summarizes the basic configurations 
available when using the 82495XP/82490XP. 
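
The way these options constrain one another can be checked with a little arithmetic. The sketch below is an illustration only, not part of the chipset documentation. It assumes a 32-byte i860 XP cache line (not stated in this section), takes the 82495XP line size to be the memory bus width times the transactions per line, and takes the cache size to be tags times lines per sector times line size. The name `cache_geometry` is hypothetical.

```python
CPU_LINE_BYTES = 32  # assumed i860 XP cache line size; not stated in this section


def cache_geometry(bus_bits, transactions, tags, lines_per_sector):
    """Return (line_ratio, cache_bytes) for one hypothetical configuration."""
    line_bytes = (bus_bits // 8) * transactions   # bytes transferred per line
    line_ratio = line_bytes // CPU_LINE_BYTES     # 82495XP line size : CPU line size
    cache_bytes = tags * lines_per_sector * line_bytes
    return line_ratio, cache_bytes


if __name__ == "__main__":
    # 64-bit bus, 4 transactions per line, 8K tags, 1 line per sector
    ratio, size = cache_geometry(64, 4, 8 * 1024, 1)
    print(f"line ratio 1:{ratio}, cache size {size // 1024}K")
```

Under these assumptions, doubling the transactions per line doubles the line ratio, so the tag count or the lines per sector has to drop to keep the same cache size.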

3.1.2 SNOOP MODES 

When another master snoops the 82495XP, the 
MBC must initiate the snoop request and pass on 
the response. The 82495XP allows the MBC to initi- 
ate this snoop request in one of three modes: syn- 
chronous, clocked, and strobed. The snoop re- 
sponse of the 82495XP is always synchronous. 

When initiating the snoop in synchronous snoop
mode, all snoop information is latched by the
82495XP synchronous to the CPU CLK. The snoop
is then performed on the next CLK edge and the
response given on the CLK edge after that. This is
the fastest possible method of snooping.

In clocked snooping mode, information is latched by
the 82495XP with respect to an external snoop
clock source (slower than CLK). The 82495XP must
internally synchronize this information to CLK and
provide a response.

In strobed snooping mode, information is latched 
into the 82495XP with respect to the falling edge of 
another signal. Thus, the snoop initiation is clock in- 
dependent. The 82495XP again synchronizes this in- 
formation with CLK. 


                  MEM BUS = 64 Bits           MEM BUS = 128 Bits
Number of
82490XP Devices   4 Trans.      8 Trans.      4 Trans.        8 Trans.

8                 1             2             Not Supported   Not Supported
                  LR = 1        LR = 2
                  Tags = 8k     Tags = 4k
                  L/S = 1       L/S = 1

16                3             4             4               5
                  LR = 1        LR = 2        LR = 2          LR = 4
                  Tags = 8k     Tags = 8k     Tags = 8k       Tags = 4k
                  L/S = 2       L/S = 1       L/S = 1         L/S = 1

LR = 82495XP/CPU Line Ratio
L/S = 82495XP Lines/Sector
Cache Device: 2, 4, 8 Bits Wide


Figure 3-1. 82495XP/82490XP Configurations

3.1.3 MEMORY BUS MODES

The 82490XP may be configured to be in one of two 
memory bus modes. This mode determines how 
data will be passed on to and off of the data bus. 
The two modes are clocked mode and strobed 
mode. These modes need not have any relation to 
the snoop mode chosen. 

In clocked mode, data is driven from an external 
memory clock source called MCLK, or read with re- 
spect to MCLK. MCLK is completely independent of 
the CPU CLK source. There are inherent perform- 
ance advantages, however, in making this clock 
source synchronous or half-clock (divided) synchro- 
nous to the CPU CLK. 

In strobed mode, data is driven from the rising edge 
of one signal, and read with the rising edge of anoth- 
er. Like the strobed snooping mode, this carries no 
clock skew problems, or memory bus speed limita- 
tions. 


3.2 CPU Bus Interface 

The CPU bus interface is the connection of the
82495XP and 82490XP to the i860 XP CPU. Because
this interface is optimized to achieve high-speed
performance, it is not a flexible interface. The
majority of the signals in the CPU bus interface must
be connected strictly between the 82495XP/
82490XP cache and the i860 XP CPU. Chapter 10
addresses the use of such signals.

Some CPU signals are, however, accessible by the
MBC. These are the following pins: RESET, CLK,
BRDY2#, INT, BERR, PCHK#, PEN#, TCK, TDI,
TMS, TRST#, and TDO. CPU pins KB0, KB1, HIT#,
and BREQ are also available to the MBC, but are of
limited use in an 82495XP/82490XP system.

Other CPU pins flow through a '377 type latch to the
MBC. The latch enable is controlled by the 82495XP
through the BLE# pin. The following CPU signals
flow through this latch: PCD, PWT, BE0#-BE7#,
CACHE#, LEN, PCYC, and CTYP.


3.3 82495XP/82490XP Interface 

The 82495XP/82490XP interface is the connection
between the 82495XP and 82490XP. Like the CPU bus
interface, this isolated interface is not flexible and
may not be altered beyond what Intel has provided.


3.4 Memory Bus and Memory Bus 
Controller Interface 

The memory bus controller (MBC) is the interface
logic required to control the 82495XP/82490XP and
connect it to the memory bus and the rest of the system.
The MBC may be simple enough to support a single-CPU
write-through cache, or complex enough to
support a multiprocessing cache with external tags.
The 82495XP/82490XP is a very flexible chipset,
and the MBC determines exactly how the
82495XP/82490XP will work in a system.

An MBC consists of a few basic blocks: a snoop 
logic block, a cycle control block (with synchronizers 
if necessary), and data path control block. The 
snoop block must be able to communicate with the 
other caches when snooping is necessary. At the 
same time, the cycle control block must interface to 
some arbitration logic for bus arbitration. 

3.4.1 SNOOPING LOGIC 

The MBC snooping logic is responsible for initiating 
a snoop in the 82495XP and providing the response 
to the rest of the system. Snoop logic must recog- 
nize what other caches are doing, and snoop if nec- 
essary. Snoop logic must also recognize when its 
82495XP is not capable of snooping and delay its 
snoop initiation. 

When a cycle begins on the bus, all other caches
snoop. Once all the snoop results are returned to
the master 82495XP, its snoop logic must recognize
the result and alter the cycle appropriately. This
could mean aborting the current cycle in memory,
delaying the cycle until a write-back is performed, or
changing the master's tag state according to the
snoop information.

3.4.2 CYCLE CONTROL LOGIC 

Cycle control logic is responsible for initiating a
memory bus cycle, providing proper 82495XP cycle
attributes during the cycle, and terminating the cycle.
Cycle control logic determines the cacheability
of the cycle, whether cycles are allocatable, whether
cycles are pipelined, and all aspects of the progress
of the current cycle.

Since cycle control logic interfaces memory bus sig- 
nals to the 82495XP, and since the memory bus is 
not necessarily synchronous to the 82495XP CLK, it 
may also provide proper synchronization. Careful 
design of this synchronization logic can minimize or 
eliminate synchronization penalties.

3.4.3 DATA PATH CONTROL

Data path control logic controls how data is written 
from the 82490XP or read into the 82490XP and 
CPU. It handles the actual transferring of data to/ 
from the memory data bus. Data path control logic 
also handles the CPU burst order, and the holding of 
data during allocation cycles. In systems with memory
buses that are wider than the CPU bus, the data
path control logic steers data to the
correct 82490XPs.


3.5 Test 

The 82495XP/82490XP provide two means of 
cache testing. These are a built-in self-test, and 
boundary scan test. The built-in self-test (BIST) is 
initiated during RESET. The boundary scan test 
uses separate and dedicated pins on the 82495XP. 
These are described in a later chapter. 


4.0 CACHE CONSISTENCY 
PROTOCOL 

One of the 82495XP objectives is to implement a
high performance second level cache for multiprocessor
systems. To fulfill this objective the 82495XP
implements a “write-back” cache with full support
for multiprocessing data consistency. Being a write-back
cache means that the 82495XP may contain
data which is not updated in the main memory.
Therefore a mechanism is implemented to ensure
that data read by any system bus master, at any
time, is correct.

A key feature for multiprocessing systems is reduction
of the memory bus utilization. The memory bus
quickly becomes a resource bottleneck with the addition
of multiple processors. The 82495XP cache
consistency mechanism ensures minimal usage of
memory bus bandwidth.

The 82495XP allows portions of memory to be de- 
fined as non-cacheable. For the cacheable areas, 
the 82495XP allows selected portions to be defined 
as write-through locations. 

The 82495XP protocol is implemented by assigning 
state bits for each cached line. Those states are de- 
pendent on both 82495XP data transfer activities 
performed as the bus master, and snooping activi- 
ties performed in response to snoop requests gener- 
ated by other memory bus masters. 


4.1 Cache Consistency Protocol 
Model 

The 82495XP consistency protocol is the set of rules 
which allows the 82495XP to contain data that is not 
updated in main memory while ensuring that memo- 
ry accesses by other devices do not receive stale 
data. This consistency is accomplished by assigning
a special consistency state to every cached entry
(line) in the 82495XP.


NOTE: 

The following rules apply to memory read and write 
cycles. All I/O and special cycles bypass the 
cache. 


The 82495XP protocol consists of 4 states. They define
whether a line is valid (hit or miss), whether it is
available in other caches (shared or exclusive), and
whether it has been modified.



The 4 states are:

[I] - INVALID    Indicates that the line is not available
in the cache. A read to this line will be a miss and cause
the 82495XP to execute a line fill (fetch the whole line
and deposit it into the cache SRAM). A write to this line
will cause the 82495XP to execute a write-through cycle
to the memory bus and in some circumstances initiate an
ALLOCATION.

[S] - SHARED    This state indicates that this line is
potentially shared with other caches (the same line may
exist in more than one cache). A Shared line can be read
out of the cache SRAM without a main memory access.
Writing to a Shared line updates the 82495XP/82490XP
cache, but also requires the 82495XP to generate a
write-through cycle to the memory bus. In addition to
updating main memory, the write-through cycle will
invalidate this line in other caches. Since writing to a
Shared line causes a write-through cycle, the system can
enforce a “write-through policy” to selected addresses by
forcing those addresses into the [S] state. This can be
done by setting the PWT attribute in the CPU page table
or asserting the MWB/WT# pin each time the address is
referenced.

[E] - EXCLUSIVE    This state indicates a line which is
exclusively available in ONLY this cache, and that this
line is NOT MODIFIED (main memory also has a valid copy).
Writing to an Exclusive line causes it to change to the
Modified state and can be done without informing other
caches, so no memory bus activity is generated.

[M] - MODIFIED    This state indicates a line which is
exclusively available in ONLY this cache, and is MODIFIED
(main memory's copy is stale). A Modified line can be
updated locally in the cache without acquiring the memory
bus. Because a Modified line is the only up-to-date copy
of data, it is the 82495XP's responsibility to flush this
data to memory on accesses to it. Flushing of this data
to memory will be executed immediately after completion
of the current CPU bus cycle.


4.2 Basic State Transitions 

This section covers the most common, basic memory
accesses. The special functions which force a cycle
to be noncacheable, locked, read only, or direct-to-Modified
are not in use. These might be used, for
example, in read for ownership and cache to cache
transfers, and are covered in section 4.3. This basic
transitions section is divided into two parts: the first
covers MESI state changes which occur in a CPU/
cache core due to its own actions; the second
describes MESI state transitions in a CPU/cache core
caused by the actions of other, external devices.
Figure 4-1 shows a partial state diagram of the MESI
coherency protocol which includes these basic
transitions.

The 82495XP accepts line attributes from the CPU 
and memory buses. The 82495XP assumes that all 
caches on the memory bus have the SAME number 
of bytes per line. 


4.2.1 TRANSITIONS IN CACHE STATES 

CAUSED BY OWN CPU TRANSACTIONS 

The MESI state of each 82495XP/82490XP cache 
line changes as the 82495XP/82490XP services the 
read and write requests generated by its CPU. 


4.2.1.1 Read Hit 

A read hit occurs when the CPU generates a read 
cycle on its bus, and the data is present in and re- 
turned by the 82495XP/82490XP. The state of the 
cache line (M, E, or S) remains unchanged by a read 
operation which hits the cache. 

4.2.1.2 Read Miss 

A read miss arises when the CPU generates a read,
and the data is not present in the
82495XP/82490XP cache: either the tag lookup
does not produce a match, or a match occurs but the
data is Invalid. The 82495XP generates a memory
access to fetch the data (which is assumed cacheable
for this discussion) and the surrounding data
needed to fill the cache line. This data is placed in
the 82495XP/82490XP cache in an Invalid line or (if
both are valid) replaces the least recently used line,
which is written back to memory if Modified.

The new line is placed in the Exclusive state, unless
either the CPU or memory indicates that it should be
a write-through on its next write access using PWT
or MWB/WT#, respectively. If either of these is
asserted, the new line is placed in Shared state. A new
line could also be read in and placed directly into
Modified state: see section 4.3.4 for details and use.
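
As a rough illustration of the fill rule above, the following sketch picks the state of a newly filled line after a cacheable read miss. It is a simplified model, not the 82495XP logic: the names `State` and `read_miss_fill_state` are hypothetical, `mwt` stands for "MWB/WT# indicates write-through", and the direct-to-Modified case of section 4.3.4 is deliberately left out.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def read_miss_fill_state(pwt, mwt):
    """State given to a line filled after a cacheable read miss.

    pwt: True if the CPU asserts PWT for the access.
    mwt: True if MWB/WT# signals write-through for the access.
    """
    if pwt or mwt:
        return State.SHARED     # the next write to this line must reach the memory bus
    return State.EXCLUSIVE      # the next write can complete locally
```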


4.2.1.3 Write Hit 

When the CPU generates a write cycle and the data is
present in the 82495XP/82490XP cache, the data is
updated and the line may undergo a MESI state change.

If the hit line is originally in the Exclusive state, it
changes to Modified state upon a write. If the hit line
is originally in the Modified state, it remains in that
state. Neither of these cases generates any bus activity.

A write to a line which is in the Shared state causes
the 82495XP to write the data out to memory as well
as update the 82495XP/82490XP cache. The write
to main memory also serves to invalidate any copy
of the data which resides in another cache. The
cache line state changes according to activity on the
PWT and MWB/WT# pins. If neither of these pins is
asserted, the write hit line becomes Exclusive. If either
of these pins is asserted, the line is forced to
remain write-through, so the state remains Shared.
An existing line can also be written and forced
directly into Modified state: see section 4.3.4 for
details and use.
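
The write hit cases can be summarized in a small transition function. This is a minimal sketch under stated assumptions, not the actual controller behavior: `State` and `write_hit` are hypothetical names, `mwt` stands for "MWB/WT# indicates write-through", and read-only lines and the direct-to-Modified option are not modeled.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def write_hit(state, pwt=False, mwt=False):
    """Return (next_state, memory_bus_write) for a CPU write that hits the cache."""
    if state in (State.EXCLUSIVE, State.MODIFIED):
        return State.MODIFIED, False          # updated locally, no bus activity
    if state is State.SHARED:
        # The write goes through to memory and invalidates other cached copies.
        next_state = State.SHARED if (pwt or mwt) else State.EXCLUSIVE
        return next_state, True
    raise ValueError("an Invalid line cannot produce a write hit")
```

Returning the bus-activity flag alongside the state makes the cost of each case visible: only the Shared case touches the memory bus.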


4.2.1.4 Write Miss 

The CPU generates a write cycle, and the data is not
present in the 82495XP/82490XP cache. In a simple
write miss, the 82495XP/82490XP assists the CPU in
delivering data to memory, but the data is not placed
in the cache. No cache lines are affected, so no
state changes take place.


4.2.1.5 Write Miss with Allocate 

This is a special case of a write miss where the
memory location written by the CPU is not currently
in the 82495XP/82490XP cache, but is brought into
the cache and updated. Like a regular write miss,
the 82495XP/82490XP assists the CPU in writing
the data out to main memory. After the data is written
to memory, the 82495XP/82490XP reads back
the same data following the rules of a read miss,
above.
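
The choice between a simple write miss and a write miss with allocate can be sketched as a single predicate over the conditions this section gives for allocation: the write is cacheable, PWT is not asserted, the write is not LOCKed, and the write is to memory rather than I/O. The function name and return strings are hypothetical, and the sketch treats the four conditions as jointly sufficient, which the text does not strictly state.

```python
def write_miss_action(cacheable, pwt, locked, to_memory):
    """Classify a CPU write that misses the cache.

    cacheable: True if the write is cacheable.
    pwt:       True if PWT is asserted (forces write-through).
    locked:    True if the write is a LOCKed cycle.
    to_memory: True if the write targets memory, False for I/O.
    """
    if cacheable and not pwt and not locked and to_memory:
        # Data is written to memory, then the line is read back as in a read miss.
        return "write miss with allocate"
    return "write miss"
```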


The ability to perform an allocation depends on all of
the following conditions:

the write is cacheable

PWT, which forces write-through, is not asserted

the write is not LOCKed

the write is to memory (not to I/O)

4.2.2 TRANSITIONS CAUSED BY OTHER
DEVICES ON BUS

MESI state transitions in the 82495XP/82490XP
cache of one core (CPU/82495XP/82490XP) can
be induced by actions initiated by other cores or
devices on the shared memory bus. In the following,
the 82495XP which is responding to actions of other
devices does not currently own the bus, and may be
referred to as a “slave” or, in the case of snooping,
a “snooper”. The device which currently owns the
bus is the “master”.


4.2.2.1 Snooping

The master which is accessing data from memory
on the bus sends a request to all caching devices on
the bus (snoopers) that they check or snoop their
caches for a more recently updated version of the
data being accessed. If one of the snoopers has a
copy of the requested data, it is termed a “snoop
hit”.

If a snooper has a modified version of the data
(“snoop hit to a Modified line”), it proceeds to generate
an “inquire cycle” to the i860 XP CPU, asking
the i860 XP CPU if it also has a Modified copy of the
line (which would be more recently modified than the
82495XP/82490XP’s version). The most up-to-date
line is written out by the snooping
82495XP/82490XP to the bus (to main memory or
directly to the requesting master) so that the
requesting master can utilize it.

The changes in MESI protocol state in a snooping
cache which has a snoop hit depend on attribute
inputs SNPINV and SNPNCA, which are driven by
the master.


The SNPINV input tells a snooping
82495XP/82490XP to invalidate the line being
snooped if hit: the master requesting the snoop is
about to write to its copy of this line and will therefore
have the most up-to-date copy. When SNPINV
is asserted on the snoop request, any snoop hit is
placed in Invalid state, and a “back invalidation” is
generated which instructs the CPU to check its
cache and likewise invalidate a copy of the line.
When the snooping 82495XP has a snoop hit to a
Modified line and SNPINV was asserted by the bus
master, the back invalidate is combined with the
inquire cycle.

The SNPNCA input tells a snooping
82495XP/82490XP whether the requesting master
is performing a Non-Caching Access. If the requesting
master is not caching the data, a snoop hit to a
Modified or Exclusive line can be placed in the
Exclusive state: since the requester isn’t caching the
line, if the snooper has a future write hit to the line,
an invalidation does not have to be broadcast. If the
requesting master is caching the data, then a snoop
hit to a Modified or Exclusive line must be placed in
the Shared state, which ensures that a future write hit
causes an invalidation to other caches. Note that a
snoop hit to a Shared line must remain in the Shared
state regardless of SNPNCA. Also note that an
asserted SNPINV always overrides SNPNCA.
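
The effect of SNPINV and SNPNCA on a snoop hit can be sketched as follows. This is a simplified model for illustration: `State` and `snoop_hit` are hypothetical names, and the write-back flag only records that a Modified line has to be written out to the bus, without modeling the inquire cycle or back invalidation to the CPU.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def snoop_hit(state, snpinv, snpnca):
    """Return (next_state, writeback_needed) for a snoop that hits a valid line."""
    if state is State.INVALID:
        raise ValueError("an Invalid line is a snoop miss, not a snoop hit")
    writeback_needed = state is State.MODIFIED   # only this cache holds current data
    if snpinv:
        return State.INVALID, writeback_needed   # SNPINV overrides SNPNCA
    if state is State.SHARED:
        return State.SHARED, False               # Shared stays Shared regardless of SNPNCA
    # Modified or Exclusive line: keep exclusivity only if the requester is not caching.
    return (State.EXCLUSIVE if snpnca else State.SHARED), writeback_needed
```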


4.2.2.2 Cache Synchronization 

Cache synchronization is performed to bring the
main memory up-to-date with respect to the
82495XP/82490XP. Two mechanisms exist in the
82495XP/82490XP to accomplish this: FLUSH and
SYNC.

A cache flush is initiated by asserting the 82495XP
FLUSH# pin. Once initiated, the 82495XP writes all
Modified lines out to main memory, performing back
invalidations and inquire cycles on the CPU. When
completed, all 82495XP/82490XP and CPU cache
entries will be in the Invalid state.

Activation of the SYNC# pin also causes all of the
82495XP’s Modified lines to be written to memory.
Unlike the FLUSH# pin, the cache lines remain valid
after the SYNC# process has completed, with
Modified lines changing to the Exclusive state.
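
The difference between the two operations is easiest to see on a small set of lines. The sketch below is a toy model, not the chip's sequencing: the cache is a plain dictionary from address to state, and `State` and `synchronize` are hypothetical names.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def synchronize(lines, flush):
    """Model FLUSH (flush=True) or SYNC (flush=False) on {address: State}.

    Returns (new_lines, written_back), where written_back lists the
    addresses of Modified lines that are written to main memory.
    """
    written_back = [addr for addr, st in lines.items() if st is State.MODIFIED]
    new_lines = {}
    for addr, st in lines.items():
        if flush:
            new_lines[addr] = State.INVALID      # FLUSH leaves every entry Invalid
        elif st is State.MODIFIED:
            new_lines[addr] = State.EXCLUSIVE    # SYNC keeps the line, now clean
        else:
            new_lines[addr] = st                 # other lines are untouched by SYNC
    return new_lines, written_back
```

Both operations write back the same lines; they differ only in what remains valid afterwards.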


4.3 The Effects of Special Cycles on 
MESI States 

4.3.1 NON-CACHEABLE ACCESS 

The 82495XP allows cacheability to be determined
on both a per page and per line basis. The page
cacheability function is determined by software,
while cacheability on a line-by-line basis is driven by
hardware.

The PCD (Page Caching Disabled) pin is an 82495XP
input driven by the CPU’s PCD output, which corresponds
to a cacheability bit in the page table entry of
a memory location’s virtual address. If the PCD bit is
asserted when the CPU presents a memory address,
that location will not be cached in either the
82495XP or the CPU.

MKEN# is a 82495XP input which connects to the 
memory bus controller or the memory bus. MKEN# 
inactive prevents the caching of the memory loca- 
tion in both the 82495XP and the CPU, affecting only 
the current access. 

Figure 4-1. Major State Transitions

If a read miss is indicated non-cacheable by either of
these, the line is not placed in the
82495XP/82490XP or CPU cache, and no cache
states are modified. On a write miss, a noncacheable
indication from either input forces a write miss
without allocation. Note that if the 82495XP/
82490XP already has a valid copy of the line, the
PCD attribute from the CPU is ignored.
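
The two cacheability inputs combine as a simple veto: either one can make a miss noncacheable. The sketch below illustrates that rule for cache misses only. The function name and return strings are hypothetical, `mken` means "MKEN# is active", and a "possible" allocation still depends on the other allocation conditions.

```python
def miss_action(is_write, pcd, mken):
    """Classify a cache miss from the PCD and MKEN# cacheability inputs.

    is_write: True for a write miss, False for a read miss.
    pcd:      True if the CPU asserts PCD for the access.
    mken:     True if MKEN# is active (the memory bus allows caching).
    """
    cacheable = mken and not pcd    # either input can veto caching
    if is_write:
        if cacheable:
            return "write miss, allocation possible"
        return "write miss without allocation"
    if cacheable:
        return "read miss with line fill"
    return "read miss without line fill"
```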

4.3.2 READ ONLY ACCESSES: MRO# 

The MRO# (Memory Read Only) input is driven by 
the memory bus to indicate that a memory location 
is read only. 

When asserted during a read miss line fill, MRO# 
causes the line to be placed in the 
82495XP/82490XP cache in the Shared state and 
also sets a read-only bit in the cache tag. MRO# 
accesses are not cached in the CPU. On subse- 
quent write hits to a read-only line, the write is actu- 
ally written through to memory without updating the 
82495XP/82490XP line, which remains in the 
Shared state with the read-only bit set. 
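
The read-only behavior can be illustrated with a toy line record. This sketch is not the chip's implementation: `Line`, `fill_with_mro` and `write_hit_read_only` are hypothetical names, and memory is modeled as a plain dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


@dataclass
class Line:
    data: int
    state: State
    read_only: bool = False


def fill_with_mro(data):
    """Line fill on a read miss with MRO# asserted: Shared, read-only bit set."""
    return Line(data=data, state=State.SHARED, read_only=True)


def write_hit_read_only(line, value, memory, address):
    """Write hit to a read-only line: memory is updated, the cached line is not."""
    memory[address] = value     # the write goes through to memory
    return line                 # data, state and read-only bit are unchanged
```

For example, after `line = fill_with_mro(0x11)`, a call to `write_hit_read_only(line, 0x22, memory, 0x100)` changes `memory[0x100]` but leaves `line.data` at `0x11`.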


4.3.3 LOCKED ACCESSES: LOCK#

The LOCK# signal driven by the CPU indicates that
the requested cycle should lock the memory location
for an atomic memory access. Because locked
cycles are used for interprocessor and intertask
synchronization, all locked cycles will appear on the
memory bus.

On a locked write, the 82495XP treats the access as
a write-through cycle, sending the data to the memory
bus, which updates memory and invalidates other
cached copies. If the data is also present in the
82495XP/82490XP cache, it is updated but its M, E,
or S state remains unchanged.

For locked reads, the 82495XP assumes a cache
miss and starts a memory read cycle. If the data
resides in the 82495XP/82490XP, the M, E, or S state
of the data remains unchanged. If the requested
data is in the 82495XP/82490XP and is in the
Modified state when the memory bus returns data,
the 82495XP will use the 82490XP data and ignore
the memory bus data.

LOCKed read and write cycles which miss the
82495XP/82490XP cache are noncacheable in both
the 82495XP/82490XP and CPU.


4.3.4 FORCING LINES DIRECT-TO-MODIFIED:
DRCTM#

The DRCTM# (Direct To Modified) pin is an input
which informs the 82495XP to skip the Exclusive
state and place a line directly in the Modified state.
The signal can be asserted during
82495XP/82490XP reads of the memory for special
82495XP/82490XP data accesses like read-for-ownership
and cache-to-cache-transfer. The signal
can also be asserted during writes, for purposes of
cache tracking.


4.4 State Tables

Lines cached by the 82495XP can change states as
a result of either CPU bus activity (which sometimes
requires the 82495XP to become a memory bus
master) or memory bus activity generated
by other system masters (snooping).

State transitions are affected by the type of CPU/
memory bus transactions (reads, writes) and by a
set of external input signals and internally generated
variables. In addition, the 82495XP will drive certain
CPU/memory bus signals as a result of the
consistency protocol.


4.4.1 CPU BUS 

— PWT (Page Write Through, PWT input pin): Indicates
a CPU bus write-through request. Activated
by the i860 XP CPU PWT pin. This signal affects
line fills and will cause a line to be put in the
[S] state if active. The 82495XP will NOT execute
ALLOCATIONS (line fills triggered by a
write) for write-through lines. If PWT is asserted,
it overrides a write-back indication on the
MWB/WT# pin.

— PCD (Page Cacheability Disable, PCD input pin):
Indicates that the accessed line is noncacheable.
If PCD is asserted, it overrides a cacheable
indication from an asserted MKEN#.

— NWT (i860 XP CPU Write-Through Indication,
82495XP’s WB/WT# output pin): When low,
forces the i860 XP CPU to keep the accessed
line in the SHARED state.
Write-back mode (WB = 1) will be indicated by
the !NWT notation. In those cases the i860 XP
CPU is allowed to go into the exclusive states [E],
[M]. NWT is normally active unless explicitly stated.

— KEN (CPU caching enable, KEN# output pin):
When active, indicates that the requested line
can be cached by the CPU 1st level cache. KEN
is normally active unless explicitly stated.

4.4.2 MEMORY BUS 

— MWT (Memory Bus Write-Through Indication,
MWB/WT# input pin): When active, forces the
82495XP to keep the accessed line in the
SHARED state. Write-back mode (MWB = 1) will
be indicated by the !MWT notation. In those cases
the 82495XP is allowed to go into the exclusive
states [E], [M].

— DRCTM (Memory Bus Direct To [M] Indication,
DRCTM# input pin): When active, forces skipping
of the [E] state and direct transfer to [M].

— MKEN (Memory Bus Cacheability Enable,
MKEN# input pin): When active, indicates that
the memory bus cycle is cacheable.

— MRO (Memory Bus Read-Only Indication,
MRO# input pin): When active, forces the line to be
READ-ONLY.

— MTHIT (Tag Hit, MTHIT# output pin): Activated
by the 82495XP during snoop cycles and indicates
that the current snooped address hits the
82495XP cache.

— MHITM (Hit to a line in the [M] State, MHITM#
output pin): Activated by the 82495XP during
snoop cycles and indicates that the current
snooped address hits a modified line in the
82495XP cache.

— SNPNCA (Non Caching device access): When
active, indicates to the 82495XP that the current
bus master is a non-caching device.

— SNPINV (Invalidation): When active, indicates to
the 82495XP that the current snoop cycle will
invalidate that address.


4.4.3 TAG STATE 

— TRO (Tag Read Only, 82495XP Tag bit): This bit 
when set indicates that the 1 or 2 lines associat- 
ed with this tag are Read-Only lines. 


As a function of State Changes the 82495XP 
may execute the following cycles: 

— BINV: Execution of a CPU Back Invalidation Cy- 
cle (Snoop with INV active) 

— INQR: Execution of an i860 XP CPU Inquire Cycle (1).

— WBCK: 82495XP Write-Back Cycle. This is a 
Memory Bus write cycle generated by the 
82495XP when MODIFIED data cached in the 
82495XP needs to be copied back into main 
memory. A write-back cycle affects a complete 
82495XP line. 

— WTHR: 82495XP Write Through Cycle. This is a 
system write cycle in response to a processor 
write. It may or may not affect the cache SRAM 
(update). In a write-through cycle, the 82495XP 
drives the Memory Bus with the same Address, 
Data and Control signals as the CPU does on the 
CPU Bus. Main Memory is updated, and other 
Caches invalidate their copies. 

— RTHR: 82495XP Read Through Cycle. This is a special cycle to
support locked reads to lines that hit the 82495XP cache. The 82495XP
requests a Memory Bus cycle for lock synchronization reasons. Data is
supplied from the BUS except for the [M] state, in which case data is
supplied from the CACHE.

— LFIL: 82495XP Cache line fill. The 82495XP generates Memory Bus
cycles to fetch a new line and deposit it into the cache.

— RNRM: 82495XP Read Normal Cycle. This is a normal read cycle which
is executed by the 82495XP for non-cacheable accesses.

— SRUP: 82495XP SRAM UPDATE. Occurs any 
time new information is placed in the 82495XP 
cache. An SRAM update is implied in the LFIL 
cycle. 

— ALLOC: 82495XP ALLOCATION. A Write Miss cycle that has been
determined to be cacheable, so the 82495XP issues a line read.

NOTE: 

1. An Inquire cycle may be executed with INV ac- 
tive, performing a back-invalidation simultaneously. 
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Table 4-1. Master 82495XP Read Cycle 


Pres. | Condition: Next State                    | Mem Bus  | CPU Bus  | Comments
State |                                          | Activity | Activity |
------+------------------------------------------+----------+----------+------------------------------------
M     | !LOCK: M                                 | -        | !NWT     | Normal Read Hit [M]
M     | LOCK: M                                  | RTHR     | !KEN     | Read Through Cycle, Data From Array
E     | !LOCK: E                                 | -        | NWT      | Normal Read Hit [E]
E     | LOCK: E                                  | RTHR     | !KEN     | Read Through Cycle, Data From Memory
S     | !LOCK.!TRO: S                            | -        | NWT      | Normal Read Hit [S]
S     | !LOCK.TRO: S                             | -        | !KEN     | Normal Read to Read-Only sector.
      |                                          |          |          | Stays in [S] state and deactivates
      |                                          |          |          | KEN to prevent CPU from caching line
S     | LOCK: S                                  | RTHR     | !KEN     | Read Through Cycle, Data From Memory
I     | PCD + !MKEN + LOCK: I                    | RNRM     | !KEN     | Non-Cacheable Read, Locked cycles
I     | !PCD.MKEN.!LOCK.MRO: S                   | LFIL     | !KEN     | Cacheable read, Read-Only. Fill line
      |                                          |          |          | to 82495XP. Do not allow CPU to cache
      |                                          |          |          | line by deactivating KEN#. Set the
      |                                          |          |          | 82495XP's TRO bit to indicate the
      |                                          |          |          | sector read-only attribute
I     | !PCD.MKEN.!LOCK.!MRO.(PWT+MWT): S        | LFIL     | NWT      | Cacheable Reads, forced Write-Through
I     | !PCD.MKEN.!LOCK.!MRO.!PWT.!MWT.!DRCTM: E | LFIL     | NWT      | Line not shared, thus enabling the
      |                                          |          |          | 82495XP to move into an exclusive
      |                                          |          |          | state
I     | !PCD.MKEN.!LOCK.!MRO.!PWT.!MWT.DRCTM: M  | LFIL     | NWT      | As before with direct [M] state
      |                                          |          |          | transfer. Keep i860 XP CPU in Write
      |                                          |          |          | Through mode
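
The [I]-state (read miss) rows of Table 4-1 can be read as a priority-ordered decision. The following is a minimal sketch in Python, assuming every argument is a logical "asserted" flag (True means the attribute is active, regardless of pin polarity). The function name and return tuple are illustrative and are not part of the chipset documentation.

```python
def read_miss_next_state(pcd, mken, lock, mro, pwt, mwt, drctm):
    """Model of the [I] rows of Table 4-1 (read cycle that misses the cache).

    Returns (next_state, mem_bus_activity, cpu_bus_activity).
    All inputs are logical flags: True = attribute asserted.
    """
    # Non-cacheable or locked read: normal read cycle, line stays invalid.
    if pcd or not mken or lock:
        return ("I", "RNRM", "!KEN")
    # Cacheable but read-only: fill to [S], CPU may not cache the line.
    if mro:
        return ("S", "LFIL", "!KEN")
    # Write-through requested by the CPU (PWT) or the memory bus (MWT).
    if pwt or mwt:
        return ("S", "LFIL", "NWT")
    # Line not shared: DRCTM skips [E] and goes directly to [M].
    if drctm:
        return ("M", "LFIL", "NWT")
    return ("E", "LFIL", "NWT")


if __name__ == "__main__":
    print(read_miss_next_state(False, True, False, False, False, False, False))
```

The order of the tests mirrors the order of the table rows, so each later branch can assume that the earlier conditions are false.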
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Table 4-2. Master 82495XP Write Cycle 


Pres. | Condition: Next State              | Mem Bus  | CPU Bus    | Comments
State |                                    | Activity | Activity   |
------+------------------------------------+----------+------------+---------------------------------------
M     | !LOCK: M                           | -        | SRUP, !NWT | Write hit. Write to cache. Allow i860
      |                                    |          |            | XP CPU to perform internal write cycles
      |                                    |          |            | (enter into [E], [M] states).
M     | LOCK: M                            | WTHR     | SRUP, !NWT | Locked Cycle. Write-Through updating
      |                                    |          |            | cache SRAM. Most updated copy of the
      |                                    |          |            | line is still owned by 82495XP. All
      |                                    |          |            | Locked write cycles are posted.
E     | !LOCK: M                           | -        | SRUP, !NWT | Write hit. Update SRAM. Let i860 XP
      |                                    |          |            | CPU execute internal write cycles.
E     | LOCK: E                            | WTHR     | SRUP, NWT  | Lock forces cycle to memory bus. Main
      |                                    |          |            | memory remains updated.
S     | TRO: S                             | WTHR     | -          | Read-Only. Write cycle with write
      |                                    |          |            | through attribute from CPU or Memory
      |                                    |          |            | Bus. Locked Cycles.
S     | !TRO.(PWT + MWT + LOCK): S         | WTHR     | SRUP, NWT  | Not Read-Only. Write cycle with write
      |                                    |          |            | through attribute from CPU or Memory
      |                                    |          |            | Bus. Locked Cycles.
S     | !TRO.!PWT.!LOCK.!MWT.!DRCTM: E     | WTHR     | SRUP, NWT  | Not Read-Only. No write-through cycle,
      |                                    |          |            | no lock request: allow going into
      |                                    |          |            | exclusive state.
S     | !TRO.!PWT.!LOCK.!MWT.DRCTM: M      | WTHR     | SRUP, NWT  | Not Read-Only. No write-through cycle,
      |                                    |          |            | no lock request: allow going into
      |                                    |          |            | exclusive state. DRCTM forces final
      |                                    |          |            | state to M.
I     | PCD + !MKEN + PWT + LOCK + MRO: I  | WTHR     | -          | Write Miss Non-Cacheable, Write-
      |                                    |          |            | Through, locked cycle or Read-Only.
I     | !PCD.MKEN.!PWT.!LOCK.!MRO: I       | WTHR,    | ALLOC      | Write Miss with allocation. After the
      | !PCD.MKEN.!PWT.!LOCK.MRO: S        | LFIL     |            | write cycle, a line fill (allocation)
      | Allocation Final State:            |          |            | is scheduled. If MKEN and MRO are
      |   MWT: S                           |          |            | asserted, an allocation to the [S]
      |   !MWT.!DRCTM: E                   |          |            | state will occur. Allocation final
      |   !MWT.DRCTM: M                    |          |            | state as a function of line fill
      |                                    |          |            | attributes.


NOTE: 

The WB/WT# pin will only be activated for 82495XP lines that are in the [M] state. In this state, the 82495XP always 
assumes that the line owner MAY be the i860 XP CPU. On all other states the i860 XP CPU will be forced to perform Write- 
Through cycles. This mechanism will make sure that any i860 XP CPU write cycle is seen at least once on the CPU Bus. 
Allocations, which are consequences of write-misses, will disregard the MKEN# and MRO# attributes during the line fill. In 
other words, once an allocation is scheduled, it cannot be cancelled. 
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Table 4-3. Snooping 82495XP without Invalidation Request 


Pres. | Condition:  | Mem Bus       | CPU Bus  | Comments
State | Next State  | Activity      | Activity |
------+-------------+---------------+----------+------------------------------------------------
M     | !SNPNCA: S  | MTHIT, MHITM, | INQR     | Snoop hit to modified line. 82495XP indicates
      | SNPNCA: E   | WBCK          |          | tag hit and modified hit. 82495XP schedules
      |             |               |          | flushing of the modified line to memory. If
      |             |               |          | non-caching device, stay in [E] state.
E     | !SNPNCA: S  | MTHIT         | -        | If snooping by caching device, indicate MTHIT
      | SNPNCA: E   |               |          | and go to shared state. If non-caching device,
      |             |               |          | only indicate MTHIT, stay exclusive.
S     | S           | MTHIT         | -        |
I     | I           | -             | -        |


NOTE: 

Usage of DRCTM# to avoid [E] states may be in conflict with the SNPNCA cycle attribute. Note in the table that snoops
with SNPNCA may cause an [E] state transition.


Table 4-4. Snooping 82495XP with Invalidation Request

Pres. | Next  | Mem Bus       | CPU Bus    | Comments
State | State | Activity      | Activity   |
------+-------+---------------+------------+------------------------------------------------
M     | I     | MTHIT, MHITM, | INQR, BINV | Snoop hit to modified line. 82495XP indicates
      |       | WBCK          |            | tag hit and modified hit. 82495XP schedules
      |       |               |            | flushing of the modified line to memory.
      |       |               |            | Invalidate CPU.
E     | I     | MTHIT         | BINV       | Indicate tag hit, invalidate 82495XP, CPU lines.
S     | I     | MTHIT         | BINV       | Same as before
I     | I     | -             | -          |



Table 4-5. SYNC Cycles 


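Pres. | Next  | Mem Bus  | CPU Bus  | Comments
State | State | Activity | Activity |
------+-------+----------+----------+----------------------------------------------------
M     | E     | WBCK     | INQR     | Get modified data from i860 XP CPU, flush to memory
E     | E     | -        | -        | Memory already synchronized
S     | S     | -        | -        | Memory already synchronized
I     | I     | -        | -        |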
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Table 4-6. FLUSH Cycles 


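Pres. | Next  | Mem Bus  | CPU Bus    | Comments
State | State | Activity | Activity   |
------+-------+----------+------------+-----------------------------------
M     | I     | WBCK     | INQR, BINV | Flush and invalidate i860 XP CPU
E     | I     | -        | BINV       | Invalidate i860 XP CPU
S     | I     | -        | BINV       | Invalidate i860 XP CPU
I     | I     | -        | -          |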



NOTE: 

Usage of DRCTM# to avoid [E] states may be in conflict with the SYNC cycle. Note in the table that SYNC cycles move an
[M] state line to [E].


5.0 CONFIGURATIONS 

The 82495XP/82490XP cache system was de- 
signed to fit a variety of applications. For the great- 
est performance, each application requires the 
82495XP/82490XP to be configured differently. The 
82495XP/82490XP therefore has many possible 
configurations that are set on RESET and affect the 
82495XP/82490XP architecture, operation, and 
electrical characteristics. 


5.1 Physical Cache 

The physical configurations of the 82495XP/82490XP consist of
parameters that alter the basic 82495XP/82490XP architecture. These
are line ratio, tag size, lines per sector, bus width, and cache
size. These parameters are sampled at the falling edge of RESET and
are not dynamically changeable.

Because of physical cache constraints, choosing one parameter limits
the flexibility of other parameters. The following table summarizes
the possible i860 XP CPU basic cache configurations. CFG0-CFG2 are
multiplexed to select one of 5 possible line ratio/tag size/lines
per sector configurations. This information is automatically passed
from the 82495XP to the 82490XP during RESET. CFG0-CFG3 must be
valid at least 10 clocks before RESET's falling edge.


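Number of       | MEM BUS = 64 Bits             | MEM BUS = 128 Bits
82490XP Devices | 4 Trans.      | 8 Trans.      | 4 Trans.      | 8 Trans.
----------------+---------------+---------------+---------------+---------------
8               | Config 1      | Config 2      | Not Supported | Not Supported
                | LR = 1        | LR = 2        |               |
                | Tags = 8K     | Tags = 4K     |               |
                | L/S = 1       | L/S = 1       |               |
----------------+---------------+---------------+---------------+---------------
16              | Config 3      | Config 4      | Config 4      | Config 5
                | LR = 1        | LR = 2        | LR = 2        | LR = 4
                | Tags = 8K     | Tags = 8K     | Tags = 8K     | Tags = 4K
                | L/S = 2       | L/S = 1       | L/S = 1       | L/S = 1

LR = 82495XP/CPU Line Ratio
L/S = 82495XP Lines/Sector
Cache Device 2, 4, 8 Bits Wide
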

Figure 5-1. 82495XP/82490XP Configurations 
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5.1.1 LINE RATIO (LR) 

Line Ratio (LR) is the ratio of the 82495XP/82490XP cache line size
to the CPU cache line size. For example, if LR = 2 then the
82495XP/82490XP line size is 64 bytes. This information is also used
to determine the number of back invalidations or inquire cycles to
the i860 XP CPU.
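
As a worked illustration of the line ratio, the sketch below computes the 82495XP/82490XP line size from LR. It assumes a 32-byte CPU cache line, which is what the "LR = 2 gives 64 bytes" example implies, and it assumes one back invalidation or inquire cycle per CPU line covered by a second-level line. The names are hypothetical.

```python
CPU_LINE_BYTES = 32  # assumption: implied by "LR = 2 -> 64-byte line"

def cache_line_bytes(line_ratio):
    """Second-level cache line size in bytes for a given line ratio."""
    if line_ratio not in (1, 2, 4):  # ratios that appear in Figure 5-1
        raise ValueError("unsupported line ratio: %r" % (line_ratio,))
    return line_ratio * CPU_LINE_BYTES

def cpu_cycles_per_line(line_ratio):
    """Assumed count of back invalidations / inquire cycles per cache line."""
    return cache_line_bytes(line_ratio) // CPU_LINE_BYTES


if __name__ == "__main__":
    for lr in (1, 2, 4):
        print(lr, cache_line_bytes(lr), cpu_cycles_per_line(lr))
```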

5.1.2 TAG SIZE (TAGS) 

The 82495XP/82490XP cache tag size may be 4K 
or 8K tag entries. By reducing tag size, the line ratio 
(LR) can be doubled without a change in cache size. 

5.1.3 LINES PER SECTOR (L/S) 

The 82495XP/82490XP may be non-sectored (L/S = 1) or contain two
lines per sector (L/S = 2). If L/S = 2, then the 82495XP contains
one tag for two consecutive cache lines and each cache line has its
own set of MESI state bits. This allows just one line to be filled
on replacements or written back on snoop hits. Both lines are
written back during replacements, if both are modified.


5.1.4 BUS SIZE 

The 82495XP/82490XP supports 64 and 128 bit 
memory bus widths for the i860 XP CPU. 

5.1.5 CACHE SIZE 

The 82495XP/82490XP may be configured to be 256K or 512K bytes.
Cache size is a direct result of the number of 82490XP devices used.
It takes 8 82490XPs to make a 256K byte cache and 16 82490XPs for a
512K byte cache.


5.1.6 FUNCTION AND ADDRESS 
CONNECTIONS (CFA0-CFA6) 

The following table lists which address lines should be connected to
each of the CFA0-CFA6 lines for each cache configuration. CFA0-CFA6
provide the 82495XP with proper multiplexed addresses for each of
the possible cache configurations. Depending on the mode selected,
either CFA5 or CFA4 will operate as the 82495XP's CTYP input. This
input is connected to the i860 XP CPU's CTYP output.
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Table 5-2. CFA Address Connections 


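Config | Line  | Lines/ | No. of | CFA6 | CFA5 | CFA4 | CFA3 | CFA2 | CFA1 | CFA0
No.    | Ratio | Sector | Tags   |      |      |      |      |      |      |
-------+-------+--------+--------+------+------+------+------+------+------+------
1      | 1     | 1      | 8K     | A5   | CTYP | A31  | [?]  | A29  | [?]  | A3
2      | 2     | 1      | 4K     | A5   | CTYP | A31  | [?]  | A29  | [?]  | [?]
3      | 1     | 2      | 8K     | A6   | A5   | CTYP | A31  | A30  | A4   | A3
4      | 2     | 1      | 8K     | A6   | A5   | CTYP | A31  | A30  | A4   | A3
5      | 4     | 1      | 4K     | A6   | A5   | CTYP | A31  | A30  | A4   | A3

[?] = entry not legible in the source.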


5.2 Cache Modes 

Cache modes are ways of configuring the 
82495XP/82490XP to operate differently. These op- 
tions are all sampled at RESET and are not dynami- 
cally changeable. If some of these configuration op- 
tions share a pin, such as the 82495XP’s SYNC# 
and MEMLDRV, the configuration option must meet 
a specific setup and hold time to RESET’s falling 
edge. For the 82495XP, setup time is usually 4 
clocks, and for the 82490XP, setup time is usually 1 
clock. For both parts, the configuration option must 
be held until RESET is detected low. 



Figure 5-2. Configuration Input Sampling 


5.2.1 MEMORY BUS MODES 

The 82495XP/82490XP may be configured to have a clocked or strobed
memory bus. Memory bus mode is selected by the 82490XP MSTBM pin
(same as the MCLK pin). If MSTBM is strapped high, the 82490XPs
operate in strobed mode. If MSTBM is toggling, i.e. it is connected
to the memory bus clock, the 82490XP operates in clocked mode. MCLK
need not be synchronous to CLK.


5.2.2 SNOOPING MODES 

The 82495XP/82490XP supports three snooping modes: synchronous,
clocked, and strobed. Snooping mode is selected by the SNPMD (same
as SNPCLK) pin. If SNPMD is low, the 82495XP snoops synchronously.
If SNPMD is high, the 82495XP snoops in strobed mode. If SNPMD is
toggling, clocked mode is selected and SNPMD becomes a snoop clock
source, SNPCLK, which clocks in the snoop requests.
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These three snooping modes only alter the way the 
memory bus controller may initiate a snoop request 
to the 82495XP. The 82495XP response is always 
synchronous to the CPU CLK. 
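
The SNPMD selection rule can be summarized as a small classifier. This is a minimal sketch, assuming the pin is observed as a sequence of logic levels (0 or 1) around RESET; the function name and the sampling model are illustrative only and do not describe the device's internal detection logic.

```python
def snoop_mode(snpmd_levels):
    """Classify the snooping mode from observed SNPMD levels.

    snpmd_levels: non-empty sequence of 0/1 samples of the SNPMD pin.
    Constantly low  -> synchronous
    Constantly high -> strobed
    Toggling        -> clocked (SNPMD acts as SNPCLK)
    """
    levels = set(snpmd_levels)
    if not levels or not levels <= {0, 1}:
        raise ValueError("expected a non-empty sequence of 0/1 levels")
    if levels == {0}:
        return "synchronous"
    if levels == {1}:
        return "strobed"
    return "clocked"


if __name__ == "__main__":
    print(snoop_mode([0, 0, 0]))      # strapped low
    print(snoop_mode([1, 1, 1]))      # strapped high
    print(snoop_mode([0, 1, 0, 1]))   # connected to a snoop clock
```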


5.2.3 BUS DRIVERS 

The 82495XP/82490XP provide two types of memory bus drivers: high
capacitance drivers and low capacitance drivers. The high capacitance
drivers are selected by driving both the 82495XP and 82490XP MEMLDRV
pins low at RESET. Similarly, the low capacitance drivers are
selected with MEMLDRV high.

With C490LDRV the 82495XP also provides two types of drivers when
driving the 82490XPs. Refer to the interface document to determine
C490LDRV.


5.2.4 STRONG/WEAK WRITE ORDERING 

If the 82495XP pin WWOR# is sampled low at RESET, the 82495XP
enforces weak write-ordering. If sampled high, the 82495XP enforces
strong write-ordering. Strong write-ordering prevents the 82495XP
from completing a write cycle that would go to the [M] state if a
posted write is pending (has not been granted the bus with BGT#). By
doing this, strong ordering ensures that write cycles from the CPU
are written to memory in the same order that they appear in the
i860 XP CPU program.


5.2.5 i860 XP CPU PFLD SUPPORT

The i860 XP microprocessor executes PFLD (Pipe- 
lined Floating-Point Load) instructions to implement 
special data handling, typically for vector operations. 
This instruction allows loading of data through a 
FIFO pipeline, to hide memory latency. The i860 XP 
CPU does not cache data returned by a PFLD cycle. 

The 82495XP can be configured to decode the 
i860 XP microprocessor’s PFLD cycles. The 
82495XP supports 3 operational modes for PFLD 
cycle decoding: 

Mode #1. PFLD cycles are cached in the 82495XP.

         This mode is used in applications that can fit entirely in
         the 82495XP/82490XP cache. The 82495XP treats PFLD cycles
         as normal read cycles.

Mode #2. PFLD cycles are not cached in the 82495XP, without an
         external PFLD extension FIFO.

         This mode is used when applications are too large to fit in
         the 82495XP/82490XP cache. The 82495XP treats PFLD cycles
         as noncacheable, using the same protocol as cycles with
         PCD = 1 (if data is already cached, it will be supplied
         from the cache).

Mode #3. PFLD cycles are not cached in the 82495XP, with an external
         PFLD extension FIFO.

         This mode allows the PFLD FIFO to be extended beyond the
         three stages built into the i860 XP CPU by adding external
         FIFO hardware. The 82495XP treats PFLD cycles in the same
         manner as LOCKed cycles (all cycles go to the bus, even if
         data is already present in the cache). To support the
         external FIFO, the 82495XP identifies PFLD cycles by
         asserting its FPFLD output. For proper operation, data
         which can be accessed by PFLD must never be in the cache in
         the Modified state, and software must be aware of the
         length of the combined PFLD pipeline. Because this mode is
         not software transparent, it must be used with extreme
         care.



The choice of PFLD mode is largely application dependent. The PFLD
mode of the 82495XP is selected by configuration pins FPFLDEN and
NCPFLD#, which are sampled at RESET. FPFLDEN shares a pin with
FPFLD, and NCPFLD# shares a pin with FLUSH#. Depending on the PFLD
mode, data for reads will be supplied to the CPU either from the
82495XP or from the memory bus. Table 5-3 summarizes the 82495XP's
support for i860 XP CPU PFLD cycles.
Table 5-3. 82495XP PFLD Modes


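Mode# | FPFLDEN | NCPFLD# | Data Supplied From                | Line Fill
      |         |         | [I]    | [S]    | [E]    | [M]    | on [I]
------+---------+---------+--------+--------+--------+--------+----------
1     | 0       | 1       | MEMBUS | CACHE  | CACHE  | CACHE  | Yes
2     | 0       | 0       | MEMBUS | CACHE  | CACHE  | CACHE  | No
3     | 1       | 1       | MEMBUS | MEMBUS | MEMBUS | MEMBUS | No
X     | 1       | 0       | Illegal Mode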
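
The strap decoding in Table 5-3 can be expressed as a lookup. This is a minimal sketch, assuming the two arguments are the logic levels sampled on the FPFLDEN and NCPFLD# pins at RESET; the function name is hypothetical.

```python
# (FPFLDEN level, NCPFLD# level) -> PFLD mode number, per Table 5-3.
_PFLD_MODES = {
    (0, 1): 1,  # PFLD cycles cached in the 82495XP
    (0, 0): 2,  # not cached, no external PFLD extension FIFO
    (1, 1): 3,  # not cached, external PFLD extension FIFO
}

def pfld_mode(fpflden, ncpfld_n):
    """Return the PFLD mode (1, 2 or 3) for the sampled pin levels.

    Raises ValueError for FPFLDEN = 1 with NCPFLD# = 0, which the
    table lists as an illegal mode.
    """
    try:
        return _PFLD_MODES[(int(fpflden), int(ncpfld_n))]
    except KeyError:
        raise ValueError("illegal PFLD strap combination") from None


if __name__ == "__main__":
    print(pfld_mode(0, 1))
```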


5.3 82490XP Bus Configuration 

The 82490XP needs to be configured so it knows to 
drive 4 or 8 MDATA lines and whether it should do 4 
or 8 memory transfers per line fill. This is done 
through the MX4/MX8# and the MTR4/MTR8# 
configuration inputs. For a given line ratio (memory 
bus line size / CPU line size), they should be sam- 
pled as follows: 


Table 5-4. MX/MTR Configurations 


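Line  | MX4/ | MTR4/ | Membus | CPUbus
Ratio | MX8# | MTR8# | I/O    | I/O
------+------+-------+--------+--------
1     | 1    | 1     | 4      | 4
2     | 1    | 0     | 4      | 4
2     | 0    | 1     | 8      | 4
4     | 0    | 0     | 8      | 4
1     | 0    | 1     | 8      | 8
2     | 0    | 0     | 8      | 8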
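
The rows of Table 5-4 follow from simple arithmetic on the bus width, the number of 82490XP devices, and the line ratio. The sketch below reproduces the table under three assumptions that are not stated in this section: a 32-byte CPU line, a 64-bit CPU data bus, and a memory bus whose data lines are divided evenly among the 82490XP devices. Function and parameter names are illustrative.

```python
CPU_LINE_BYTES = 32   # assumption: LR = 2 corresponds to a 64-byte line
CPU_BUS_BITS = 64     # assumption: 64-bit CPU data bus

def mx_mtr(line_ratio, mem_bus_bits, devices):
    """Return (MX4/MX8#, MTR4/MTR8#, membus I/O bits, CPU bus I/O bits).

    Pin values are the levels to strap: 1 selects the "4" option,
    0 selects the "8" option.
    """
    mem_io = mem_bus_bits // devices          # MDATA lines per device
    cpu_io = CPU_BUS_BITS // devices          # CDATA lines per device
    line_bits = line_ratio * CPU_LINE_BYTES * 8
    transfers = line_bits // mem_bus_bits     # memory transfers per line fill
    if mem_io not in (4, 8) or transfers not in (4, 8):
        raise ValueError("combination not covered by Table 5-4")
    mx = 1 if mem_io == 4 else 0
    mtr = 1 if transfers == 4 else 0
    return (mx, mtr, mem_io, cpu_io)


if __name__ == "__main__":
    # 512K cache (16 devices), 128-bit memory bus, LR = 4
    print(mx_mtr(4, 128, 16))
```

Combinations that fall outside the table, such as a 128-bit memory bus with 8 devices, raise an error rather than returning a guess.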


5.3.1 82490XP PARITY CONFIGURATION 

An 82490XP may be designated as a parity device. This is done by
strapping the PAR# pin low. In this configuration CDATA[0:3] are
used to store 4 parity bits, and CDATA[4:7] are used as 4 bit
enables. The four bit enables allow the writing of individual parity
bits.

Every mode and configuration of a non-parity 
82490XP may be used and selected on the parity 
82490XP device. The 82490XP parity configurations 
are as follows: 


Table 5-5. Parity Configurations 


Cache | Memory    | Number    | 82490XP
Size  | Bus Width | of Parity | I/O bits
      |           | Devices   | (CPU/Mem)
------+-----------+-----------+-----------
256K  | 64        | 2         | 4/4
512K  | 128       | 2         | 4/8


5.3.2 CPU 82490XP ADDRESS 
CONFIGURATIONS 

The 82490XP Address inputs (A) are multiplexed to 
the CPU address lines (CA) according to the cache 
size: 


Table 5-6. 82490XP Address Connections 


Size | 82490XP Address Pins
     | A15  A14  A13  A12  A11  A10  A9   A8   A7   A6   A5   A4   A3   A2   A1   A0
-----+--------------------------------------------------------------------------------
256K | CA16 CA15 CA14 CA13 CA12 CA11 CA10 CA9  CA8  CA7  CA6  CA5  CA4  CA3  CA?  Vss
512K | CA17 CA16 CA15 CA14 CA13 CA12 CA11 CA10 CA9  CA8  CA7  CA6  CA5  CA4  CA3  CA2

CA? = line number not legible in the source.

NC = No Connect. 
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6.0 CACHE OPERATION 



Figure 6-1. Memory Bus Controller Interface Model 


Figure 6-1 shows the memory bus controller (MBC) interface model.
The memory bus controller interfaces to the i860 XP CPU, 82495XP,
82490XP, and memory bus. The MBC interface was defined with a
minimal set of assumptions as to the memory bus implementation. The
chipset was designed to enable flexibility in the design of a memory
bus and controller.

The 82495XP requests control of the memory bus by signalling the
memory bus controller. The memory bus controller is responsible for
arbitrating and granting the bus to the 82495XP. Once granted, the
memory bus controller is responsible for executing the requested
cycle, snooping the other caches, and ending the cycle. The 82495XP
supports different modes of snooping, different modes of memory bus
operation, and various special cycles. Memory Bus Controller design
dictates which of these features are used, and exactly how they are
used.
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6.1 Cycle Attribute and Progress 


Cycle Request:  CADS#, SNPADS#, CDTS#

Cycle Progress: BGT#
                KWEND# (ATTRIB: MKEN#, MRO#)
                SWEND# (ATTRIB: MWB/WT#, DRCTM#)
                CNA#
                CRDY#



Figure 6-2. Cycle Attribute and Progress Signals 

CADS# indicates the start of the cycle address phase. CDTS# tracks
CADS# and indicates the start of the cycle data phase. For READ
cycles it indicates that, starting in the next CLK, the CPU data bus
is in read mode under the control of the MBC until the last BRDY#.
In Read cycles, if the MBC already owns the CPU data bus, CDTS# will
be activated with CADS#. For ALLOCATE cycles the MBC does not need
the CPU data bus, therefore CDTS# is activated together with CADS#.

For Write cycles CDTS# indicates that the first piece of data is
available on the memory bus. For write-back cycles CDTS# indicates
that all data is available (write-back buffer or snoop buffer loaded
with correct write-back data).


After BGT# the memory bus controller owns the cycle. The 82495XP
assumes the cycle will terminate and will not re-issue it on snoop
write-backs. Following BGT# comes KWEND#, which indicates that the
cacheability window is closed and that the 82495XP can sample the
MKEN# and MRO# attributes. These indicate cacheability and read-only
status, respectively, to the 82495XP. These attributes can be
determined by decoding the 82495XP address. Based on those
attributes the 82495XP executes ALLOCATIONS, Line-fills,
Replacements, etc.

Following KWEND#, SWEND# is activated. It indi- 
cates that the Snoop Window is closed. The 
82495XP samples MWB/WT# and DRCTM# attri- 
butes. These attributes are determined by snooping 
the other caches in the system. At this point the 
82495XP updates its TAG RAM state related to the 
line access in progress. 

Lastly, the MBC issues CRDY#, which indicates to the 82495XP the end
of the transaction data phase.

The 82495XP allows memory bus pipelining by pro- 
viding CNA# which allows the MBC to request a 
new address phase before the conclusion of the cur- 
rent data phase. The 82495XP supports a 1 level 
deep address pipeline on the Memory Bus. 
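
The ordering of the cycle progress signals described above (CADS#, then BGT#, then KWEND#, then SWEND#, then CRDY#) can be checked with a few lines of code. This is a minimal sketch for a single non-pipelined cycle, assuming the events are given as a list of signal names in the order they were observed. KWEND# and SWEND# are treated as optional, and CDTS# and CNA# are ignored by the check. The function name is hypothetical.

```python
ORDER = ["CADS#", "BGT#", "KWEND#", "SWEND#", "CRDY#"]

def is_valid_cycle(events):
    """Check one memory bus cycle against the described signal order.

    events: list of signal names in order of assertion.
    Requires CADS# first, CRDY# last, BGT# present, and every ordered
    signal to appear at most once and in ascending ORDER position.
    """
    seq = [e for e in events if e in ORDER]   # drop CDTS#, CNA#, etc.
    if not seq or seq[0] != "CADS#" or seq[-1] != "CRDY#":
        return False
    if "BGT#" not in seq:
        return False
    ranks = [ORDER.index(e) for e in seq]
    # Strictly increasing ranks also rules out duplicates.
    return all(a < b for a, b in zip(ranks, ranks[1:]))


if __name__ == "__main__":
    print(is_valid_cycle(["CADS#", "CDTS#", "BGT#", "KWEND#", "SWEND#", "CRDY#"]))
```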


6.2 Snoop Operations 


As a response to the cycle request, the memory bus controller
responds with cycle progress signals. All cycle progress signals are
sampled ONCE in specific windows and then ignored until CRDY# of the
corresponding cycle. BGT# indicates a commitment by the memory bus
controller to complete the cycle execution on the memory bus. Up
until this point the 82495XP owns the cycle. This means that
intervening snoop write-backs will abort it and the 82495XP
re-issues the cycle to the MBC. There is only one case where the
82495XP will issue a new, not a re-issued, cycle: if the original
CADS# operation is a write-back cycle, and the interrupting snoop
cycle hits that write-back buffer, then the subsequent CADS# will be
for a completely new cycle (not a re-issuing of the interrupted
CADS# operation).


The 82495XP provides the capability of snooping 
operations on the memory bus to ensure cache con- 
sistency. A snoop operation consists of two phases: 
1) Initiation phase and 2) response phase. 


Initiation -> Response



Figure 6-3. 82495XP Snooping Operations 

During the initiation phase the MBC provides the 
82495XP with the snoop address information. During 
the response phase the 82495XP provides the 
snoop status information. 
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6.2.1 SNOOP INITIATION PHASE 

The 82495XP provides three modes for initiating snoops:

1. Strobed: the falling edge of SNPSTB# is used. 

2. Clocked: SNPSTB# is sampled with SNPCLK. 

3. Synchronous: SNPSTB# is sampled with CLK. 

These three snooping modes are configured as fol- 
lows: 

1. Strobed: The SNPCLK[SNPMD] signal must be 
strapped high. 

2. Clocked: The SNPCLK[SNPMD] signal must be 
connected to the snoop clock source. 

3. Synchronous: The SNPCLK [SNPMD] signal 
must be strapped low. 


NOTE: 

The 82495XP samples the SNPCLK[SNPMD] signal at the falling edge of
RESET to determine the snoop mode. If a rising edge occurs on
SNPCLK[SNPMD] after RESET has gone inactive, clocked mode will be
selected. Systems using strobed or synchronous mode must ensure that
no rising edge occurs on SNPCLK[SNPMD] after RESET has gone
inactive.

Figure 6-4 shows the strobed method of snoop initiation. The memory
address, SNPNCA, SNPINV, and MBAOE# are latched with the falling
edge of SNPSTB#. If MAOE# is sampled active (low), the SNPSTB# will
not cause a snoop. The snoop initiation is recognized by the
82495XP, is synchronized in the next clock, and causes a snoop in
the following clock.




Figure 6-4. Strobed Snoop Mode 
Figure 6-5 shows the clocked method of snoop initiation. The memory
address, SNPNCA, SNPINV, and MBAOE# are latched with the rising edge
of SNPCLK when SNPSTB# is first sampled low. SNPSTB# must be sampled
high for at least one SNPCLK in order to rearm for another snoop. If
MAOE# is sampled active (low), the SNPSTB# will not cause a snoop.
The snoop initiation is recognized by the 82495XP, is synchronized
in the next clock, and causes a snoop in the following clock.



Figure 6-5. Clocked Snoop Mode 
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Figure 6-6 shows the synchronous snoop mode. The 
memory address, SNPNCA, SNPINV, and MBAOE# 
are latched with the rising edge of CLK when 
SNPSTB# is first sampled low. SNPSTB# must be 
sampled high for at least one CLK in order to rearm
for another snoop. If MAOE# is sampled active
(low), the SNPSTB# will not cause a snoop. The
snoop initiation is recognized by the 82495XP, and
causes a snoop in the next clock.
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Figure 6-6. Synchronous Snoop Mode 
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6.2.2 RESPONSE PHASE 

The snoop response phase consists of two parts: 
1) 82495XP state indication 2) 82495XP snoop pro- 
cessing completion. The response phase is AL- 
WAYS synchronous with the CPU CLK. The 
82495XP state indication is presented on MHITM# 
and MTHIT# and remains stable until the next 
snoop. These signals indicate the state of the 
82495XP line just prior to the snoop operation. The 
memory bus controller can predict the final state of 
the 82495XP line knowing the initial state and the 
SNPINV and SNPNCA inputs. The snoop comple- 
tion information is determined by the SNPBSY# out- 
put. The SNPBSY# output inactive indicates that 
the 82495XP is ready to accept another snoop cy- 
cle. 
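
The prediction mentioned above can be written directly from Tables 4-3 and 4-4. The following is a minimal sketch, assuming the arguments are logical "asserted" flags and the state is one of the letters M, E, S, I; the function name and the return tuple are illustrative, not part of the chipset interface.

```python
def snoop_response(state, snpinv, snpnca):
    """Predict the outcome of a snoop from Tables 4-3 and 4-4.

    state:  line state before the snoop, one of "M", "E", "S", "I".
    snpinv: True if the snoop requests invalidation.
    snpnca: True if the current bus master is a non-caching device.
    Returns (next_state, mthit_asserted, mhitm_asserted).
    """
    if state not in ("M", "E", "S", "I"):
        raise ValueError("unknown state: %r" % (state,))
    mthit = state != "I"          # any valid line is a tag hit
    mhitm = state == "M"          # hit to a modified line
    if state == "I" or snpinv:
        next_state = "I"
    elif state == "S":
        next_state = "S"
    else:                         # M or E without invalidation
        next_state = "E" if snpnca else "S"
    return (next_state, mthit, mhitm)


if __name__ == "__main__":
    print(snoop_response("M", False, False))
```

A memory bus controller model could call this with the MTHIT#/MHITM# indication it observes to track the final state of the line without a further lookup.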

Figure 6-7 shows the 82495XP response to snoops without
invalidation. The first snoop is to a line which is not currently
stored in the cache.

Figure 6-8 shows the 82495XP response to snoops 
with invalidation. 

The SNPBSY# signal will be activated for one of two reasons: 1) a
snoop hit to a modified line, in which case SNPBSY# remains active
until the modified line has been written back; 2) a back
invalidation is needed and there is a back invalidation in process.
The minimum SNPBSY# active time is two CLK periods. This allows
external logic to trap-hold an active SNPBSY# using CLK. The
external logic must first look for an active SNPCYC# and then
trap-hold SNPBSY#.


6.2.3 PIPELINED SNOOPS 

The 82495XP allows the memory bus controller to 
pipeline snoop operations. The 82495XP allows the 
next snoop address to be supplied and the next 
snoop requested before the last snoop has complet- 
ed. 

A set of rules governs the operation of pipelined snoops. These
rules are as follows:

(1) For strobed mode snoops, the memory bus con- 
troller cannot cause a second falling edge of 
SNPSTB# until after the falling edge of 
SNPCYC#. 

(2) For clocked mode snoops, the memory bus con- 
troller cannot cause a second falling edge of 
SNPSTB# to be sampled by SNPCLK, until after 
the falling edge of SNPCYC#. 
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SNPSTB# \ / \ / \ / 
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Figure 6-7. Snoops without Invalidation 
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Figure 6-8. Snoops with Invalidation 
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(3) For synchronous mode snoops, the memory bus 
controller cannot cause a second falling edge of 
SNPSTB# to be sampled by CLK, until the CLK 
after SNPCYC# is active. 


6.2.4 OVERLAPPING SNOOPS WITH MEMORY 
BUS CYCLES 

The 82495XP allows snoops to be overlapped with data transfers. The
82495XP divides the memory bus cycle into 4 main regions as shown
below:

CRDY# | Region 1 | CADS# | Region 2 | BGT# | Region 3 | SWEND# | Region 4 | CRDY# | CADS#

Region 1 is after a previous memory bus cycle (i.e. after CRDY#) and
before the new memory bus cycle starts (before CADS#). A snoop in
this region is looked up immediately and serviced immediately.

Region 2 is after a memory bus cycle has started (CADS#) but before
the 82495XP has been granted the bus (BGT#). A snoop in this region
is looked up immediately and serviced immediately. CADS# is
re-issued for the aborted cycle once the snoop completes.

Region 3 is after the 82495XP has been granted the bus and before
SWEND# is completed. A snoop in this region has its lookup blocked
until after SWEND#. After SWEND#, the snoop response is given, but
no write-back will be initiated until after CRDY#.

Region 4 is after SWEND# and before CRDY#. A snoop in this region is
looked up immediately but serviced after CRDY#. This snoop is
logically treated as if it occurred after CRDY# (snoop hits to
modified data will schedule a write-back which will be executed
after the conclusion of the current memory bus cycle). Note that the
result of the snoop (MHITM#, MTHIT#) will be available immediately
with the look-up.
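
The four regions can be modeled as a lookup keyed on the most recent cycle progress signal seen when the snoop arrives. This is a minimal sketch under that assumption; the function name and the wording of the returned strings are illustrative only.

```python
# Most recent signal seen -> region number.
_REGION_OF = {
    "CRDY#": 1,   # between cycles
    "CADS#": 2,   # cycle started, bus not yet granted
    "BGT#": 3,    # bus granted, snoop window still open
    "KWEND#": 3,  # KWEND# falls between BGT# and SWEND#
    "SWEND#": 4,  # snoop window closed, data phase in progress
}

# Region -> (when the lookup happens, when the snoop is serviced).
_BEHAVIOR = {
    1: ("immediate", "immediate"),
    2: ("immediate", "immediate"),      # CADS# is re-issued afterwards
    3: ("after SWEND#", "after CRDY#"),
    4: ("immediate", "after CRDY#"),
}

def snoop_region(last_signal):
    """Return (region, lookup_timing, service_timing) for a snoop."""
    try:
        region = _REGION_OF[last_signal]
    except KeyError:
        raise ValueError("unknown signal: %r" % (last_signal,)) from None
    lookup, service = _BEHAVIOR[region]
    return (region, lookup, service)


if __name__ == "__main__":
    print(snoop_region("BGT#"))
```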
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6.2.5 SNOOP INTERLOCK 

The 82495XP uses two interlock mechanisms to en- 
sure that Snoops are identified within the proper re- 
gion. The first interlock ensures that once a BGT # 
has been given snoops are blocked until after 
SWEND#. The second interlock ensures that once 
a snoop has been started BGT# cannot be given 
until after the snoop has been serviced. 

Figure 6-11 shows how once the 82495XP sees a 
BGT# it blocks all snoops until after SWEND#. If a 
snoop has been initiated, and no SNPCYC# has 
been issued before BGT # assertion, the snoop has 
been blocked. 

Figure 6-12 shows a snoop occurring before BGT#. 
Once the 82495XP has honored a snoop, the 
82495XP, depending on the result of the snoop, may 
ignore BGT# until the snoop is serviced. The 
82495XP will always ignore BGT # when SNPCYC# 


is active. If the snoop result is a hit to a modified line 
(MHITM# active), the 82495XP will ingore BGT # as 
long as both SNPBSY# and MHITM# remain ac- 
tive. In this case, it is the memory bus controller’s 
responsibility to hold BGT # until SNPBSY # goes 
inactive or reassert it after SNPBSY# becomes in- 
active. If the snoop result is not a hit a modified line 
(MHITM# inactive), the 82495XP is capable of ac- 
cepting BGT # even when SNPBSY# Is active. This 
allows the memory bus controller to proceed with a 
memory bus cycle by asserting BGT# while the 
82495XP is performing back-invalidations. 

These two interlock mechanisms provide a flexible
method of ensuring predictable handling of
overlapped snoops.
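
The BGT# acceptance rules of the second interlock can be condensed into a small truth function. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and each argument is True when the corresponding active-low signal is asserted.

```python
def bgt_accepted(snpcyc_active: bool, snpbsy_active: bool, mhitm_active: bool) -> bool:
    """Return True if the 82495XP would accept BGT# in this clock (sketch)."""
    # BGT# is always ignored while SNPCYC# is active.
    if snpcyc_active:
        return False
    # Hit to a modified line: BGT# is ignored while both SNPBSY# and
    # MHITM# remain active, so the MBC must hold or reassert BGT#.
    if snpbsy_active and mhitm_active:
        return False
    # Otherwise BGT# can be accepted, even while SNPBSY# is active
    # (for example during back-invalidations).
    return True
```

A memory bus controller model could call this every clock and keep BGT# asserted until it returns True.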

NOTE: 

Even when snoops are delayed, address latching is 
performed with SNPSTB# activation. 






6.2.6 SNOOPS CONCURRENT WITH LINE FILL 
CYCLES 

During snoops concurrent with line-fills/allocates,
the following responsibility boundaries must be
fulfilled in order to ensure data consistency:

• If a snoop happens before BGT#, more precisely
if SNPCYC# is active before BGT#, it is the
system’s responsibility not to return stale data within
the line-fill/allocation.

• If a snoop happens after BGT#, more precisely if
SNPCYC# is active after BGT#, then the
82495XP ensures data consistency by providing
interlocks with the CPU which avoid caching of
stale data.


6.3 Memory Bus Controller Interface 
Rules 

To begin a cache cycle, the 82495XP outputs the 
CADS# signal. The cache address and other cycle 
parameters are guaranteed to be stable with 
CADS# assertion. These parameters are guaran- 
teed to be stable until CNA# or CRDY# of that cy- 
cle. After CNA# or CRDY# these parameters are 
undefined. 

Either during or after CADS#, the CDTS# signal is
asserted. Data is guaranteed to be stable with CDTS#
assertion, or the data path is available.


BGT# and CRDY# are required for all (non-snoop) 
cycles. KWEND# and SWEND# are only required 
for those cycles which sample them. 

Once a signal has been sampled, it is a “don’t care”
until CRDY# of that cycle. Additionally, these signals
plus the attributes MRO#, MKEN#, MWB/WT#, and
DRCTM# need only follow setup and hold times when
they are being sampled.

For pipelined cycles, the cycle attributes (BGT#, 
KWEND#, . . . ) will only be sampled after CRDY# 
of the previous cycle. 

Note that there are many other rules that govern 
when signals may be asserted in relation to one an- 
other. These may be found in the specific pin de- 
scriptions of each signal in chapter 7. 

Snoop-Write-Back cycles are a subset of the normal
cycles. Snoop-Write-Back cycles are requested as a
consequence of snoop hits to Modified lines. Those
are intervening cycles and are requested by activating
SNPADS# instead of CADS#. For those cycles,
the 82495XP only samples the CRDY# response.
The 82495XP assumes that the memory bus controller
owns the bus to perform the intervening write-back
(restricted back-off protocol) and that no other
agents will snoop this cycle. Also, the 82495XP will
ignore CNA# during Snoop-Write-Backs.

Figure 6-13. Cycle Progress

Figure 6-14. Cycle Progress for Snoop Cycles


6.4 LOCK# Protocol 

The 82495XP provides a LOCK signal for the memo- 
ry bus called KLOCK#. KLOCK# is generated by 
the 82495XP whenever the CPU generates the 
LOCK# signal. KLOCK#, like the other cycle attri- 
butes, is valid with CADS# assertion. 

When the CPU generates a LOCK cycle, the
82495XP always generates a bus cycle. LOCK cycles
are non-cacheable to both the 82495XP and
CPU, so the information is passed through the
82490XPs to the CPU with BRDYs generated by the
MBC. If the LOCKed read cycle is a hit in the
82495XP, the 82495XP ignores the data that it is
receiving and supplies data from the 82490XP array
(in accordance with the BRDYs supplied by the
MBC). LOCKed writes are posted like any other write.
LOCKed cycles, both reads and writes, never
change the 82495XP tag state.

During a LOCKed cycle, the MBC must prevent oth- 
er masters from snooping the 82495XP. Specifically, 
the MBC must prevent SNPSTB# between BGT# 
of the first LOCKed transfer, and SWEND# of the 
last LOCKed transfer. 


6.5 Cycle Length 

When CADS# is generated, the 82495XP outputs 
CW/R# and MCACHE#. These signals provide the 
MBC with enough information to determine the type 
of 82495XP cycle. Table 6-1 summarizes the cycle 
types for the 82495XP/82490XP. All line-fills and 
write-backs to the 82495XP/82490XP cache oper- 
ate on the entire length of a cache line. 

In addition to the length of the cycle from the 
82495XP/82490XP, the memory bus controller may 
need to determine the length of the cycle to the 
CPU. Specifically, for those 82495XP cycles where 
RDYSRC=1, the MBC must decode the i860 XP 
CPU’s W/R#, LEN, and CACHE# outputs to deter- 
mine the number of BRDY#s which the MBC will 
provide to the CPU. These signals are captured for 
the current cycle by a user-provided BE latch (see 
Section 7.2 for details). Table 6-2 presents the CPU
cycle length definitions; see the i860 XP microprocessor
Data Sheet (Order #240874) for further details.

Figure 6-15. Snooping During LOCKed Cycles

Table 6-1. 82495XP/82490XP Cycle Determination

Cycle Type            CW/R#   RDYSRC   MCACHE#   MKEN#
Posted Write            1       0         1        X
Write Backs             1       0         0        X
Non-Cacheable Read      0       1         1        X
Non-Cacheable Read      0       1         0        1
Cacheable Read          0       1         0        0
Allocation              0       0         0        X
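
As an illustration of how an MBC model might apply Table 6-1, the following Python sketch performs a direct table lookup. It is a minimal sketch, not Intel's implementation: the names `TABLE_6_1` and `decode_cycle` are hypothetical, signal levels are passed as 0 or 1, and `None` stands for the table's X (don't care) entries.

```python
# Rows of Table 6-1 as (CW/R#, RDYSRC, MCACHE#, MKEN#); None means "X".
TABLE_6_1 = [
    ((1, 0, 1, None), "Posted Write"),
    ((1, 0, 0, None), "Write Back"),
    ((0, 1, 1, None), "Non-Cacheable Read"),
    ((0, 1, 0, 1), "Non-Cacheable Read"),
    ((0, 1, 0, 0), "Cacheable Read"),
    ((0, 0, 0, None), "Allocation"),
]


def decode_cycle(cw_r: int, rdysrc: int, mcache_n: int, mken_n: int) -> str:
    """Return the Table 6-1 cycle type for the given signal levels."""
    levels = (cw_r, rdysrc, mcache_n, mken_n)
    for pattern, name in TABLE_6_1:
        # A row matches when every non-X entry equals the sampled level.
        if all(p is None or p == v for p, v in zip(pattern, levels)):
            return name
    raise ValueError(f"combination {levels} is not listed in Table 6-1")
```

Combinations that the table does not list raise an error rather than being guessed.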


Table 6-2. i860 XP CPU Cycle Determination

W/R#   LEN   CACHE#   MKEN#   Cycle Description             Burst Length
 0      0      1        —     Non-Cacheable 64-Bit Read          1
 0      0               1     Non-Cacheable 64-Bit Read          1
 1      0      1        —     64-Bit Write                       1
 —      0      1        —     I/O and Special Cycles             1
 0      1      1        —     Non-Cacheable 128-Bit Read         2
 0      1      —        1     Non-Cacheable 128-Bit Read         2
 1      1      1        —     128-Bit Write                      2
 0      —      0        0     Cache Line Fill                    4
 1      —      0        —     Cache Write-Back                   4


NOTE: 

If MRO# is asserted to the 82495XP, the effect on i860 XP CPU cycle determination is the same as when MKEN# = 1. 
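
The burst lengths of Table 6-2 can be expressed compactly. The Python sketch below is an illustration under stated assumptions: the function name `cpu_burst_length` is hypothetical, the table's "—" entries are treated as don't-cares, and the caller passes `mken_n=1` when MRO# is asserted, as the note above describes.

```python
def cpu_burst_length(w_r: int, length: int, cache_n: int, mken_n: int = 1) -> int:
    """Number of BRDY#s the MBC returns for an i860 XP cycle (sketch).

    w_r, length (LEN) and cache_n (CACHE#) are the latched CPU outputs;
    mken_n is the sampled MKEN# level.
    """
    if cache_n == 0:
        if w_r == 1:
            return 4  # cache write-back
        if mken_n == 0:
            return 4  # cache line fill
    # All remaining cycles are single transfers, or two transfers
    # when LEN indicates a 128-bit access.
    return 2 if length == 1 else 1
```

For example, a read with CACHE# = 0 that the MBC marks non-cacheable (MKEN# = 1) falls back to the length given by LEN.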


6.6 Consecutive Cycles 

Because an 82495XP line can be longer than a CPU
line, there are circumstances where a read miss will
be to a line that is currently being filled. In this
case, the 82495XP treats the miss like a read hit, but
supplies data after CRDY# for the line fill. Data is
supplied from the 82490XP array.


6.7 CPU/Memory Bus Concurrency

The 82495XP allows concurrency between the CPU 
and memory buses. CPU bus cycles will either be 
serviced locally by the 82495XP (hits) or require 
memory bus service. Whenever a CPU cycle re- 
quires memory bus service, it will be scheduled to 
run on the memory bus, and CPU bus activity will be 
allowed to continue. 

Examples of concurrency are: 

— Snoops and CPU bus operations 

— Posted writes with CPU and memory bus opera- 
tions 


— CPU bus operation on the back of long line fills 
(82495XP line longer than the CPU line) 

— Allocations and replacements with CPU and 
memory bus operations. 

In certain cases, consistency of data and prevention 
of deadlocks preclude concurrency. Problems may 
occur when the current memory bus cycle changes 
the tag state and therefore affects the operation of 
the next CPU cycle request. In those cases the 
82495XP will hold concurrency to ensure data con- 
sistency. Handling of those cases is completely 
transparent to the MBC. 

6.8 Memory Bus Modes

The 82490XP supports two modes of memory bus
operation: clocked mode and strobed mode. In
clocked mode, memory bus signals are sampled by
the 82490XP on rising edges of MCLK. Similarly,
memory bus data and signals are output by the
82490XP with respect to MCLK (or MOCLK) rising
edge transitions.

In strobed mode, memory bus signals are sampled
or output with respect to rising and falling edges of
other signals. Strobed mode has the advantage of
not requiring setup and hold times to a CLK or MCLK
edge.


6.8.1 CLOCKED MODE 

In clocked mode operation, MCLK is used to reference
the signals MDATA0-MDATA7, MSEL#,
MFRZ#, MZBT#, MBRDY#, and MEOC#. Clocked
mode will be selected if the 82490XP detects a
clock at its MCLK input after RESET. MCLK need
not have any relation to CLK. If this is the case, the
memory bus is said to be operating in “clocked
asynchronous” mode. If MCLK = CLK, the memory
bus is operating in “clocked synchronous” mode. If
MCLK × N = CLK (where N = 2, 3, 4, ...), the
memory bus is operating in “clocked divided
synchronous” mode. These three clocked modes,
asynchronous, synchronous, and divided synchronous,
are not differentiated by the 82490XP.
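
The three clocked-mode names can be illustrated with a short Python sketch. It assumes that, in the synchronous cases, MCLK is actually derived from CLK; a frequency ratio alone does not establish a phase relationship. The function name and the use of frequencies in hertz are hypothetical choices for this example.

```python
def clocked_mode(clk_hz: int, mclk_hz: int) -> str:
    """Name the clocked mode implied by the CLK/MCLK relationship (sketch)."""
    if clk_hz <= 0 or mclk_hz <= 0:
        raise ValueError("clock frequencies must be positive")
    if mclk_hz == clk_hz:
        return "clocked synchronous"
    # MCLK x N = CLK with N = 2, 3, 4, ...
    if clk_hz % mclk_hz == 0 and clk_hz // mclk_hz >= 2:
        return "clocked divided synchronous"
    return "clocked asynchronous"
```

The distinction matters only to the system designer; as stated above, the 82490XP itself does not differentiate the three cases.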

MOCLK controls a transparent latch at the 82490XP
data output pins. If a clock is provided at this input,
data is latched with MOCLK going low. This clock is
available in clocked mode only. MOCLK allows the
system to provide a greater MDATA hold time by
skewing MOCLK from MCLK. If MOCLK is tied high,
MDATA is driven from MCLK.

6.8.1.1 Synchronous Clocked Mode

In synchronous clocked mode MCLK = CLK. This 
means the CPU clock is used for 82495XP, 
82490XP, and the memory bus. A synchronous 
memory bus allows memory to communicate with 
the 82495XP without synchronizers since the 
82495XP runs with CLK. With a synchronous design, 
however, high clock frequencies must be routed to 
all parts of a system with minimal skew. This may 
not be possible with future projected frequencies. A 
synchronous memory system and memory bus con- 
troller must be redesigned when future speed up- 
grades are required. 


6.8.1.2 Asynchronous Clocked Mode

In asynchronous clocked mode, MCLK is not the 
same frequency as CLK. Some memory signals, 
since they reference MCLK, must be synchronized 
to CLK to communicate with the 82495XP. For ex- 
ample, when a cycle completes, the memory system 
asserts a signal, driven from MCLK, to the memory 
bus controller which will be synchronized to CLK to 
become CRDY#. This is because CRDY# is syn- 
chronous to CLK and not MCLK. 

Asynchronous mode allows the rest of the system to
run at a lower frequency than the CPU CLK. Not only
does this simplify system design, but it also allows the
designer to place hooks so that the same design can
scale easily to a higher frequency. If all the features
of the 82495XP are used properly, an asynchronous
memory design does not have to incur much
synchronization penalty. For example, MEOC# is
synchronous to the memory environment (MCLK). This
allows the memory system to end the current cycle
and start the next before CRDY# is synchronized in
the CPU environment.


6.8.1.3 Divided Synchronous Clocked Mode

Divided synchronous clocked mode is a subset of 
synchronous clocked mode. It allows two things to 
happen: One, the memory system is capable of 
communicating with the 82495XP without synchroni- 
zation. Two, a slower frequency clock may be routed 
around the system. 

Divided synchronous mode still requires clock skew 
restrictions. It also carries the same scalability draw- 
backs that full synchronous mode does. 

6.8.2 STROBED MODE 

Strobed mode is configured on the 82490XP by 
strapping MCLK high. In strobed mode: 

— MDATA0-MDATA7 are sampled with respect to
edges of MEOC#, MISTB, and MOSTB.

— For write cycles, MFRZ# is sampled when
MEOC# goes active.

— MZBT# is sampled when MSEL# is inactive,
and is latched when MSEL# goes active.
MZBT# is also sampled for the next operation
when MSEL# is active and MEOC# goes active.

By not using MCLK, strobed mode has no setup and
hold time restrictions, and is scalable to higher
frequencies. Strobed mode does, however, require
synchronization to 82495XP CLK synchronous signals.

6.9 Memory Bus Operation

All data is handled by the 82490XP cache RAMs.
The 82495XP instructs the 82490XP whether to use
the data array or buffers, and specifically which buffer
to use. The MBC is responsible for bursting data
in and out of the 82490XPs, in and out of the CPU
during miss cycles, and indicating when the operation
is finished. Communication between the
82490XPs and memory bus may be done in
clocked mode or strobed mode. See the Memory
Bus Modes section for more details.

6.9.1 82490XP BUFFERS AND MUXES

An 82490XP has 4 memory buffers: 2 memory
cycle buffers, one write-back buffer, and one snoop
buffer. Each buffer is capable of holding an entire
82495XP line of the longest configurable length.

The memory cycle buffers of the 82490XP are used
for posting writes and holding data during
82495XP/82490XP line-fills. The write-back buffer is
used for holding data from a cache replacement.
This data is ready to be written out, and the
write-back buffer is snoopable. The snoop buffer is used
to hold modified data that has been hit by a snoop.
Since snoop hits are the highest priority cycle, this
buffer will be emptied before any other cycle or
snoop request begins.

The 4 82490XP memory buffers are all multiplexed
(muxed) to the memory bus. The mux is used to select
which buffer is on the bus, and specifically which
slice of that buffer is on the bus. MBRDY# assertion
increments a counter for this mux which selects the
next slice of that buffer.

The counter used to increment through the buffer
slices is called the memory burst counter. The memory
burst counter follows the CPU burst order depending
on the subline address of the initial slice.
Once the MBC is finished with a buffer, MEOC# is
asserted to switch the mux to the next buffer to be
used. MEOC# will also reset the counter and latch
the last slice of data.

On the CPU side, the 82490XP contains a CPU buffer
and mux. The CPU buffer captures data from the
appropriate memory buffer or 82490XP array to feed
it to the CPU. The mux selects which slice is muxed
to the CPU bus. The counter for this mux is
incremented with BRDY#.

The 82490XP array contains a mux that selects
which way, based on the MRU algorithm, will be
read during hit cycles. This mux is used during write
cycles to write to the correct way.

6.9.2 MEMORY CYCLE BUFFERS

There are 2 memory cycle buffers in the 82490XP.
They are used for line-fills, allocates, and memory
writes. The buffers are 64 bits wide (per 82490XP) to
support 8 transfers with 8 memory bus I/O pins
(maximum configuration). The 82490XP alternates
use of these buffers. When one buffer has a posted
write or is being used for a memory read, the other
one is available for the next cycle.

During allocation cycles, read for ownership may be
implemented by using the MFRZ# signal. If MFRZ#
is sampled active during the write cycle, the memory
cycle buffer will freeze the write data in the buffer so
the subsequent line-fill fills around it. This way the
write cycle need not be written to memory. The line
must be tagged as modified.

6.9.3 WRITE BACK BUFFER AND SNOOP
BUFFER

The write back buffer and snoop buffer are both 64
bits wide to handle the maximum 82495XP line length.
The write back buffer is used when replaced data
must be written back to main memory (including
FLUSH and SYNC cycles) and the snoop buffer is
used when data must be written out from a snoop
hit.

Before a line fill begins, the 82495XP checks to see
if it must remove a modified line to make room for
the line-filled line. If so, the modified line is placed in
the write back buffer and the line-fill is filled through
a memory cycle buffer. Should the line-fill be selected
as non-cacheable, both buffer contents are discarded
and the 82490XP array value remains as it
was before the line-fill.

There is no need to run the line-fill, replacement
(write back), FLUSH, or SYNC cycles contiguously.
If a snoop is requested between the two cycles, the
write back buffer is snoopable, and data can be written
directly out of it if need be.

6.9.4 MEMORY BUS CONTROL SIGNALS


The main memory bus control signals are MSEL#,
MEOC#, MBRDY#, and CRDY#. These signals
control the 82490XP data path, buffers, and muxes.

MSEL# selects which 82490XPs are being used in
the current cycle by qualifying the MBRDY# signal.
If MSEL# is inactive, MBRDY# is not recognized for
that 82490XP. MSEL# is also used to reset the
memory burst counter. If MSEL# goes inactive, the
counter is initialized to its starting value. This is useful
for aborted/restarted cycles. MSEL# may remain
active for many or all cycles. MSEL# must, however,
be inactive sometime after RESET to initialize the
memory burst counter for the first time.

MEOC# is asserted by the MBC to finish with
the current buffer, and switch the memory bus to the
next buffer to be used. MEOC# latches in the last
piece of data and resets the memory burst counter
before switching to the new buffer.

MBRDY# is used to increment the memory burst
counter to select the next slice of data. This will
strobe data out of the 82490XP (write cycles) or load
data into the 82490XP (read cycles). MBRDY# is
ignored by the 82490XP if MSEL# is inactive.

CRDY# finishes the current cycle. Once CRDY# is
asserted, the 82490XP disposes of the information
in the buffers used in that cycle, and loads information
into the 82490XP array. CRDY# must be asserted
on the same clock as, or after, MEOC# for
a particular cycle.

6.9.5 82490XP DATA PATH

An example 82490XP read data path is shown in
Figure 6-17. The path between the CPU and memory
bus is a flow-through path, not a clocked path. Each
entire 82495XP cache line of data in the CPU buffer
is available at the memory buffer with some propagation
delay. Likewise, each entire 82495XP cache
line of data in the memory buffer is available in the
CPU buffer with some propagation delay. Data is
burst into and out of the memory buffer using
MBRDY# or MISTB/MOSTB. Data is burst into and
out of the CPU buffer using BRDY#. This means
there is no synchronization required between memory
and CPU data paths.

To give an example of how the path works, during a
CPU line fill, data may be returned to the CPU in two
different fashions. One, each time the memory buffer
fills a dword, BRDY# may be asserted a clock
later to burst it back to the CPU. Two, the memory
buffer can be filled and then BRDY# asserted on
four consecutive clocks to burst data back to the
CPU.

Figure 6-17. 82490XP Read Data Path
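
The way MSEL#, MBRDY#, and MEOC# steer the memory-side mux can be pictured with a toy model. The Python sketch below is a simplification under several assumptions that are not in the text: the class and method names are hypothetical, the counter starts at 0 and counts linearly rather than in CPU burst order, and the "next buffer" is chosen round-robin, whereas in the real parts the 82495XP selects the buffer.

```python
class MemoryBurstMux:
    """Toy model of one 82490XP memory-side mux and its burst counter."""

    def __init__(self, num_buffers: int = 4):
        self.num_buffers = num_buffers
        self.buffer = 0  # buffer currently on the memory bus
        self.slice = 0   # memory burst counter (starting value assumed 0)

    def step(self, msel: bool, mbrdy: bool = False, meoc: bool = False) -> None:
        """Advance one sampling point; arguments are True when asserted."""
        if meoc:
            # MEOC# resets the counter and switches to the next buffer.
            self.slice = 0
            self.buffer = (self.buffer + 1) % self.num_buffers
        elif not msel:
            # MSEL# inactive: MBRDY# is not recognized and the counter
            # is reinitialized to its starting value.
            self.slice = 0
        elif mbrdy:
            # MSEL# active and MBRDY# asserted: select the next slice.
            self.slice += 1
```

Deasserting MSEL# in the middle of a burst returns the counter to its starting value, which is the abort/restart behavior described in Section 6.9.4.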
6.9.6 WRITE CYCLES

There are 3 basic types of write cycles: CPU-generated
write cycles, write-back cycles caused by a
cache replacement, and snoop write-back cycles
caused by a snoop hit. All write cycles, except the
snoop write-back, begin with CADS# assertion. The
snoop write-back cycle begins with SNPADS#.

6.9.6.1 CPU Generated Write Cycles

When the CPU begins a write cycle, four things can
happen to it. One, the CPU write is a hit to a modified
or exclusive line. In this case the write is terminated
by the cache immediately and invisibly to the
MBC.

Two, the write is to a shared location. This type of
write is posted to the 82490XP memory cycle buffers
and the cycle is terminated by the cache. If a memory
cycle buffer is occupied with a write cycle, the
CPU waits until the previous write completes. The
write cycle must be written to the memory bus so
that copies of the line in other caches are
invalidated.

Three, the write is a cache miss. This type of write is
posted to a memory cycle buffer if the 82490XP is
not waiting for another posted write to complete. If
PALLC# is asserted, the write may be turned into an
allocation.

Four, the write is a LOCKed write. LOCKed writes
are posted regardless of the tag state. The write is
then treated as if it were a miss except that there is
no change in the tag state and no allocation allowed.

6.9.6.2 Cache Generated Write Cycles

The 82495XP/82490XP will generate a write cycle in
three situations: a line fill or allocation causing a
cache replacement, a snoop hit to a modified location,
and write-backs caused by FLUSH or SYNC.
Write-backs caused by FLUSH or SYNC are
indistinguishable from write-back cycles caused by
replacement. Cache generated write cycles are the length
of a cache line.

Cache replacements and FLUSH/SYNC cycles
cause a line (or two lines if sectored) of cache data
to be placed in the write-back buffer of the 82490XP.
If no cycle is pending, CADS# is asserted and the
data is written out. If a snoop hits the write-back
buffer, the data is written out via SNPADS# like a
normal snoop hit. The write-back is then cancelled
since the data was written through the snoop hit.

A snoop hit to a modified location causes a line of
cache data to be written out to memory. Snoop hits
are the highest priority cycle and must be serviced
immediately. A snoop hit to a modified location causes
the snooped line to be written to the snoop buffer
of the 82490XP. SNPADS# is then asserted and the
snoop is written out.

6.9.6.3 Memory Bus Controller Responsibility

The MBC recognizes a write cycle with CADS# and
CW/R# (or SNPADS# for snoop cycles). If
MCACHE# is active, the MBC knows the cycle is a
write-back cycle; otherwise it is a CPU-generated
cycle.

CPU-generated write cycles are written to the main
memory bus so that other caches can invalidate
their copies of this information. The other caches do
this by snooping with SNPINV active during snoop
initiation if they detect a write cycle on the bus.

Once the MBC detects CDTS# active, the data will
be available for writing in the next clock in the
appropriate 82490XP buffer. The MBC should assert
MSEL# so bursting is enabled, and burst through
the write using MBRDY# (or MOSTB). MSEL#
activation also causes MZBT# to be sampled. MZBT#
must be inactive at this time if the data will be written
according to CPU burst order.

Once the write cycle is complete, MEOC# must be
asserted to end the write cycle and switch to the
next pending cycle. If this write cycle is turned into
an allocation, MFRZ# is sampled with MEOC# to
freeze the write data in the 82490XP.

MEOC# simply switches buffers from the current
one in use to the buffer of the next pending cycle.
CRDY# needs to be asserted to actually end the
cycle and allow the 82495XP and 82490XP to dispose
of the information.
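
The four outcomes of a CPU-generated write described in Section 6.9.6.1 can be summarized in a few lines. The Python sketch below is illustrative: the `WriteOutcome` type, the function name, and the single-letter state codes ("M", "E", "S", and "I" for a miss) are assumptions made for this example rather than names taken from the data sheet.

```python
from typing import NamedTuple


class WriteOutcome(NamedTuple):
    seen_by_mbc: bool   # True if the write is posted and runs on the memory bus
    may_allocate: bool  # True if the write may be turned into an allocation


def cpu_write_outcome(state: str, locked: bool = False, pallc: bool = False) -> WriteOutcome:
    """Classify a CPU write by tag state, LOCK# and PALLC# (sketch)."""
    if locked:
        # LOCKed writes are posted regardless of tag state; no allocation.
        return WriteOutcome(True, False)
    if state in ("M", "E"):
        # Hit to a modified or exclusive line: invisible to the MBC.
        return WriteOutcome(False, False)
    if state == "S":
        # Hit to a shared line: posted so other caches can invalidate.
        return WriteOutcome(True, False)
    if state == "I":
        # Miss: posted; may become an allocation if PALLC# is asserted.
        return WriteOutcome(True, pallc)
    raise ValueError(f"unknown tag state {state!r}")
```

Only the miss case consults PALLC#, since the text mentions allocation for write misses alone.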

6.9.6.4 Write Allocation and Read for 
Ownership 

The 82495XP/82490XP supports write allocation.
An allocation cycle is a read of a cache line caused
by a write miss to the same location. In its simplest
form, a write miss is written to memory, then the
82495XP requests a line from that same location.
Meanwhile, the CPU only sees the write cycle.

Write allocation may only be done if PALLC# is active
during CADS# of the write cycle. For the allocation
to occur, MKEN# must be returned active during
KWEND# of the write cycle. The write cycle may
be an actual write or a “dummy” write. Dummy
writes are write cycles that are terminated in the
82495XP and 82490XP as if they were normal
writes, but the data is not actually written to memory.
This saves a data write to memory.

During write allocation, the write cycle will progress
like a normal write cycle except MKEN# must be
active during KWEND# activation. If the write cycle
is a dummy write, MFRZ# must be used with
MEOC# so that the line-filled data is read around
the write data into the 82490XP buffer. The line fill
cycle is like any other line fill cycle except the CPU
doesn’t get any data. If a dummy write was performed,
DRCTM# must be asserted during
SWEND# activation to fill the line to the M state,
and any cache supplying the data must invalidate its
copy.

Using dummy write cycles and filling data to the M
state from another cache or memory is called Read
For Ownership. This is because ownership is being
transferred. In read for ownership cycles, memory is
avoided as much as possible. First, the dummy write
cycle avoids memory. Second, a line fill is performed
as a cache-to-cache transfer with DRCTM# asserted.
All caches were snooped with invalidate to eliminate
their copies.

For allocation cycles, SWEND# is not sampled for
the write portion of the allocation.
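
The conditions for write allocation and the extra obligations of a dummy write can be captured as a small check. This Python sketch is only an illustration: the function name and the dictionary keys are hypothetical, and arguments are True when the corresponding signal is active at its sampling point.

```python
from typing import Dict, Optional


def allocation_requirements(pallc: bool, mken: bool, dummy_write: bool) -> Optional[Dict[str, bool]]:
    """Return what the MBC must do for a write allocation, or None (sketch).

    pallc: PALLC# active during CADS# of the write cycle.
    mken:  MKEN# returned active during KWEND# of the write cycle.
    """
    if not (pallc and mken):
        return None  # no allocation takes place
    return {
        # A dummy write needs MFRZ# with MEOC# so the fill goes around the data.
        "mfrz_with_meoc": dummy_write,
        # A dummy write needs DRCTM# at SWEND# so the line is filled to M.
        "drctm_at_swend": dummy_write,
    }
```

With an actual (non-dummy) write, neither extra signal is required by the description above.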

6.9.7 READ CYCLES 

The CPU initiates all read cycles. These are usually
line fills to the CPU and line fills to the
82495XP/82490XP. The signal MCACHE# is output
with CADS# to indicate whether this cycle may or
may not be cacheable. If cacheable, MKEN# is returned
by the MBC to ultimately determine cacheability.

Read hit cycles are serviced by the cache without 
MBC intervention. The only read cycles seen by the 
MBC (except I/O or special) are read misses and 
locked read cycles. 

Read misses cause CADS# to be asserted at most
two clocks after ADS# of the CPU read cycle. If
cacheable, as determined from MCACHE#, the
MBC will return 4 BRDYs back to the CPU and 4 or 8
MBRDYs to the 82495XP/82490XP. If the transfer is
non-cacheable, the i860 XP CPU LEN and CACHE#
outputs indicate the number of transfers to be given
to the CPU. MBRDY# need not be used in the
transfer if only a single piece of data is required by
the CPU.


If the read cycle is cacheable, it may cause another
cached line to be bumped out of the cache. This is
called a replacement and, if the line is modified, causes
a write-back cycle. While one of the 82490XP memory
buffers is being filled for the line fill, the write-back buffer
is loaded. If the line fill turns out to be non-cacheable
at the end of the transfer, the write-back buffer
is discarded, and the line in the cache remains valid.
Otherwise, CADS# will be generated after the read
cycle so the write-back can be performed. The
write-back need not happen immediately after the line fill
since the write-back buffer is snoopable.

All locked reads go to the memory bus. If the read is
a cache hit to the M state, the 82495XP/82490XP will
ignore the data that the MBC returns, and provide it
from its array. Locked reads are not cacheable by the CPU
or the 82495XP/82490XP. Snoop write-backs that
are a result of a LOCKed read/write request must
update memory.

6.9.7.1 Memory Bus Controller Responsibility 

Once the MBC sees a read cycle on the memory
bus, it must determine whether the read is cacheable
or non-cacheable using MCACHE# and its own
address decoding. If non-cacheable, the CPU expects
a number of transfers as determined by its
LEN and CACHE# outputs. If cacheable, the CPU
expects 4 transfers, and the cache expects 4 or 8
(configuration dependent).

MKEN# is sampled during KWEND# to determine
cacheability. Before MKEN# is sampled, KEN# is
active, assuming cacheability for the CPU. MKEN#
must be sampled 1 clock before the first BRDY# to
make the cycle non-cacheable.

Once the read cycle is given to the memory system,
all 82495XP/82490XP caches snoop to see if they
contain the data in modified form. If so, the MBC
must abort the cycle in memory and receive the data
directly from the 82495XP/82490XP that has it, or
wait until that cache writes it to memory. If the data
transfer avoids memory, i.e., goes cache to cache,
DRCTM# must be asserted with SWEND# to place
the line in the M state and the cache giving the data
must invalidate its copy.

MSEL# is activated and MBRDY# (or MISTB) used
to sample input data from the read cycle. Once
CDTS# has been seen active, the CPU read data
path is clear. BRDY# may be returned to the CPU
sometime after each MBRDY# for each piece of input
data (see MDATA setup to CLK). Once the
transfer completes, MEOC# and CRDY# are asserted
to complete the cycle in the 82495XP/82490XP.

6.9.8 I/O AND SPECIAL CYCLES

I/O and special cycles (flush, etc) are decoded by 
the 82495XP and not posted. These cycles wait until 
all buffers have been written, and all cycles have 
been completed, before they cause CADS# asser- 
tion. The CPU waits until the special cycle ends with 
the MBC’s BRDY# assertion before it continues. 

When the 82495XP/82490XP is performing a
FLUSH or SYNC, many write-back cycles are required.
These cycles look like ordinary write-back
cycles, and should be handled as such. FSIOUT# is
active during these write-back cycles, so when
FSIOUT# goes inactive the cycle is complete and the
memory bus controller can supply BRDY# to the
CPU.


6.10 Different Bus Widths 


The 82490XP is capable of supporting either 64- or
128-bit memory bus widths. Depending on the configuration,
the 82490XP’s CPU and I/O busses may
be multiplexed. The following diagram shows how
an i860 XP CPU may be connected to a 128-bit
memory bus:

Figure 6-18. 82490XP On Wide Bus

In this example, the CPU port of the 82490XPs is in
x4 mode and the memory bus port is in x8 mode.
This allows all 128 bits of the memory bus to be
multiplexed to the 64-bit CPU bus.

For read cycles, each MBRDY# loads 8 bits into
each 82490XP, for a total of 128 bits of data. It takes 2
BRDY# assertions to load this into the CPU. The
first BRDY# assertion loads the first 4 bits from each
82490XP onto the CPU bus, and the next BRDY#
assertion loads the remaining 4 bits.

For a 64-bit write cycle, the data is available
on the appropriate data bits. On the i860 XP CPU
with a 128-bit bus, this is determined by CPU address
bit A3. The other data bits are undefined. For
write-back cycles, all 128 bits are available at once.
MBRDY# assertion will strobe the next 128 bits onto
the memory bus.
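
The width arithmetic in this example can be checked with a short Python sketch. The helper names are hypothetical, and the device count is derived here from the stated widths (128 bits at 8 bits per 82490XP) rather than quoted from the text.

```python
def brdys_per_mbrdy(memory_port_bits: int = 8, cpu_port_bits: int = 4) -> int:
    """BRDY# assertions needed to drain the data loaded by one MBRDY#."""
    if memory_port_bits % cpu_port_bits != 0:
        raise ValueError("memory port width must be a multiple of CPU port width")
    return memory_port_bits // cpu_port_bits


def cache_ram_count(bus_bits: int, port_bits: int) -> int:
    """Number of 82490XP devices needed to span a bus of the given width."""
    if bus_bits % port_bits != 0:
        raise ValueError("bus width must be a multiple of the port width")
    return bus_bits // port_bits
```

With the x8 memory port and x4 CPU port of this example, both the 128-bit memory bus and the 64-bit CPU bus are spanned by the same number of devices.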


7.0 DETAILED PIN DESCRIPTIONS 

The following chapter provides a detailed description
of each pin of the 82495XP and 82490XP. The
pins have been categorized by function. Each pin
description has a heading which summarizes the
most important aspects of the pin. The heading is
organized as:

Pin Name
Name Meaning
Pin Function
I/O, 82495XP/82490XP/i860 XP CPU, (location)    Signal Type
Synchronous/Asynchronous

For example:

CADS#
Cache Address Strobe
Indicates beginning of cache cycle
Output from 82495XP (pin E3)    Cycle Control Signal
Synchronous to CLK

Following the heading are three sections. The first
section, Signal Description, provides information on
what the signal does, how to use it, and in what
modes it operates. The second section, When Sampled
or When Driven, indicates all the exact places
where the part samples the signal, generates the
signal, or neither. The third section, Relation to Other
Signals, mentions the other signals that are affected
by this signal, synchronization requirements,
and shared pins.

All specific information about each pin is provided in
this chapter.


7.0.1 CONFIGURATION SIGNALS 


These signals are inputs to the 82495XP and 
82490XP that are sampled at RESET and alter the 
configuration and operation of the cache. 



Figure 7-1. Configuration Input Setup and Hold 


Each set of configuration inputs may have different 
setup times, but all signals have the same hold time: 
The signals may be released on the CPU clock edge 
that RESET is detected inactive. There are some 
configuration signals that are strapping options and 
cannot change their value during 82495XP opera- 
tion. 


7.0.2 CPU BUS INTERFACE SIGNALS 

These pins comprise the interface between CPU 
and 82495XP/82490XP. The signals in this interface 
are not flexible; Chapter 10 addresses the use of 
these signals. The following are the CPU bus inter- 
face signals: 


SET0-SET10 TAG0-TAG11 CFA0-CFA6

ADS# W/R# D/C#

M/IO# HITM# LOCK#

PWT PCD LEN

BRDYC1# KEN# AHOLD

EADS# BOFF# BE0#-BE7#

INV


The majority of these signals must be connected 
strictly between the i860 XP CPU and the 82495XP. 
However, a subset of these signals is needed by the 
MBC to decode the i860 XP CPU cycle in cases 
where the MBC provides BRDYs to the CPU. For 
these purposes the following signals must also be 
inputs to a latch controlled by the 82495XP’s BLE# 
output: 

BE0#-BE7# CACHE# CTYP 

LEN PCD PCYC 

PWT 


7.0.3 82495XP/82490XP INTERFACE SIGNALS 

These pins comprise the interface between the
82495XP and 82490XP. The 82495XP uses these
pins to control the 82490XP and its buffers. The signals
in this interface are not flexible; Chapter 10 addresses
the use of these signals. The following are
the 82495XP/82490XP interface signals:

WRARR# WAY MAWEA# 

BUS# MCYC# WBWE#[LR1] 

WBA[SEC2] WBTYP[LR0] BRDYC2#

BLAST# BOFF# 


SIGNAL DESCRIPTIONS 


7.1 BGT# 

Bus Guaranteed Transfer 

Signals 82495XP of memory bus controller’s com- 
mitment to complete the bus cycle. 

Input to 82495XP (pin M3) Cycle Progress Signal
Synchronous 

7.1.1 SIGNAL DESCRIPTION 

The 82495XP owns all bus cycles (initiated by
CADS#) until the memory bus controller accepts
ownership. During this time cycles may be aborted
due to a snoop. The memory bus controller signals
its acceptance of ownership by driving BGT# active
into the 82495XP. Once BGT# is driven active, the
memory bus controller is responsible for completing
the cycle on the memory bus. CRDY# signals completion
of the cycle.

Once BGT# is asserted, other devices may not perform
snoops into the 82495XP until the end of the
snooping window (SWEND# activation). The snoop
address is latched if a snoop is requested between
BGT# and SWEND#, but the snoop does not begin
until after SWEND# is asserted. SNPCYC# will not
be asserted until the snoop window ends with
SWEND# asserted. The advantage of asserting
BGT# early is that it allows the 82495XP to start
inquiries to the CPU, load the write-back buffer, and
progress forward in the CPU bus pipeline. The disadvantage
is that snooping of this 82495XP is then
blocked until SWEND# is asserted.


7.1.2 WHEN SAMPLED 

After the 82495XP asserts CADS#, it samples
BGT# until BGT# is sampled active.

BGT# is a “Don’t Care” after it has been recognized
for cycle N and prior to the assertion of
CADS# for cycle N + 1. In addition, BGT# is a
“Don’t Care” once a cycle started by CADS# is
aborted by a snoop, until the cycle is restored by the
reissuing of CADS#.

7.1.3 RELATION TO OTHER SIGNALS 

When implementing BGT# in the MBC, the following
rules should be used:

1. BGT# must follow every assertion of CADS#,
unless the cycle is aborted due to a snoop.

2. BGT# must precede CRDY# (for line fills and allocations,
BGT# must precede CRDY# by at least 3
CLKs).

3. BGT# must be asserted with or before
the assertion of KWEND# and SWEND#.

4. BGT# must be asserted with or before the assertion
of BRDY# by the MBC.

5. BGT# is not required following the assertion of
SNPADS#.

6. BGT# must be asserted with or before the
assertion of MEOC#.
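
The BGT# rules above lend themselves to a mechanical check on a recorded
cycle. The Python sketch below is illustrative only. The Cycle record, its
field names, and the convention of storing each event as the CLK number in
which the signal was sampled active (None if it never was) are assumptions
made for this example and are not part of the 82495XP specification. BGT# in
the same CLK as CADS# is accepted here because the text does not exclude it.
Rule 5 (cycles started by SNPADS#) is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cycle:
    """CLK numbers at which signals were sampled active (None = never).

    Hypothetical record for illustration; not an Intel-defined structure.
    """
    cads: int
    bgt: Optional[int] = None
    kwend: Optional[int] = None
    swend: Optional[int] = None
    first_brdy: Optional[int] = None
    meoc: Optional[int] = None
    crdy: Optional[int] = None
    line_fill_or_allocation: bool = False
    aborted_by_snoop: bool = False


def check_bgt_rules(c: Cycle) -> List[str]:
    """Return descriptions of violated BGT# rules (empty list if none)."""
    if c.aborted_by_snoop:
        return []  # Rule 1 exempts cycles aborted by a snoop.
    if c.bgt is None or c.bgt < c.cads:
        return ["rule 1: BGT# must follow CADS#"]
    errors = []
    # Rule 2: BGT# precedes CRDY#, by at least 3 CLKs for fills/allocations.
    min_gap = 3 if c.line_fill_or_allocation else 1
    if c.crdy is not None and c.crdy - c.bgt < min_gap:
        errors.append(f"rule 2: CRDY# must be at least {min_gap} CLK(s) after BGT#")
    # Rules 3, 4 and 6: BGT# is asserted with or before these signals.
    for name, clk in (("KWEND#", c.kwend), ("SWEND#", c.swend),
                      ("BRDY#", c.first_brdy), ("MEOC#", c.meoc)):
        if clk is not None and clk < c.bgt:
            errors.append(f"BGT# must be asserted with or before {name}")
    return errors
```

A checker of this kind could be run over simulation traces of an MBC design;
an empty list means none of the modeled rules was violated.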


7.2 BLE# 

BE Latch Enable 

Controls latching of i860 XP CPU’s byte enable and 
cycle attribute signals 

Output of 82495XP (pin C16) Cycle Control Signal 
Synchronous to CLK 

7.2.1 SIGNAL DESCRIPTION 

BLE# is used to control the enable line of an exter- 
nal latch (clock edge triggered ’377 type). This latch 
is used to capture the i860 XP CPU’s byte enables 
(BE0#-BE7#) and CPU cycle attribute signals 
which do not go through the 82495XP. The 82495XP 
manages the opening and closing of this latch: when 
BLE# is active, new values from the CPU enter the 
latch at each rising edge of CLK. 

The 82495XP latches the byte enables after ADS# 
of a memory bus bound cycle. It relatches this infor- 
mation with CRDY# or CNA# of that cycle if anoth- 
er cycle is pending. 


7.2.2 WHEN DRIVEN 

The 82495XP latches the BE latch signals 1 clock
after ADS# of a memory-bound cycle. Thus latched
BE0#-BE7# are valid with CADS#. The 82495XP
opens, then closes this latch if a cycle is pending
and CNA# or CRDY# is asserted. Thus latched
BE0#-BE7# are valid two clocks after CNA# or
CRDY#, which is one clock after CADS# for
back-to-back cycles. The signals latched in the BE
latch are only valid for CPU-generated memory bus
cycles (i.e., not an 82495XP-generated write-back or
allocation).


7.2.3 RELATION TO OTHER SIGNALS 

The following CPU signals must be latched in the BE 
latch: 

BE0#-BE7# CACHE# CTYP 

LEN PCD PCYC 

PWT 

All other signals in the 82495XP to CPU interface 
(listed in sec. 7.0.2) must be connected only be- 
tween the i860 XP CPU and the 82495XP. 


7.3 BRDY# 

Burst Ready 

Memory Bus Controller Burst Ready input to 
82495XP, 82490XP, and i860 XP CPU 

Input to 82495XP and 82490XP (82495XP pin P1,
82490XP pin 60) Cycle Progress Signal

Input to i860 XP CPU (BRDY2#, pin U1)
Synchronous to CLK 

7.3.1 SIGNAL DESCRIPTION 

The BRDY# input to both the 82495XP and
82490XP must be connected to the BRDY# signal
which the MBC is providing to the i860 XP CPU’s
BRDY2# pin. The signal is used by the 82495XP for
burst tracking purposes. In the 82490XP, it increments
the CPU latch burst counter.

During CPU read cycles, BRDY# allows the next 32-
or 64-bit slice of read data to be available at the
82490XP’s CDATA outputs (CPU bus) by advancing
the CPU latch burst counter. At the same time,
BRDY# latches the previous slice of data into the
i860 XP CPU. Refer to Chapter 6 for more details.

During CPU write cycles, BRDY# is used to latch
each slice of write data into the CPU latches and
advance the latch counter.

During CPU special and I/O cycles (which are not
posted), BRDY# is used to end the cycle.

BRDY# must not be asserted until the bus is granted
(BGT# asserted) and until the data path is ready
for transferring (CDTS# is asserted).




7.3.2 WHEN SAMPLED 

BRDY# is sampled by the CPU, the 82495XP, and 
the 82490XP at every CLK edge. It must always 
meet proper setup and hold times to CLK. Even 
though the CPU latch may not be in use, BRDY# 
assertion will still advance the latch counter. 


7.3.3 RELATION TO OTHER SIGNALS 

BRDY# controls the CPU and 82490XP CPU latches.
BRDY# has the following implication rules:

1. The last BRDY# for cycle N must be asserted 2
clocks before MEOC# for cycle N + 1.

2. BRDY# ≥ BGT# (BRDY# is asserted with or after BGT#)

3. BRDY# > CDTS# (BRDY# is asserted after CDTS#)
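
Rules 2 and 3, read together with Section 7.1.3 (BGT# is asserted with or
before BRDY#) and Section 7.3.1 (BRDY# waits for both BGT# and CDTS#),
determine the earliest CLK in which an MBC may return the first BRDY#. The
short Python sketch below only illustrates that arithmetic. The function
name and the use of integer CLK numbers are assumptions for this example.

```python
def earliest_brdy_clk(bgt_clk: int, cdts_clk: int) -> int:
    """Earliest CLK for the first BRDY# of a cycle.

    BRDY# may coincide with BGT# (rule 2) but must come at least one
    CLK after CDTS# (rule 3), so the later of the two bounds applies.
    """
    return max(bgt_clk, cdts_clk + 1)
```

For instance, if BGT# is sampled active in CLK 3 and CDTS# in CLK 1, the
bound from BGT# dominates; if CDTS# arrives in CLK 4 instead, BRDY# has to
wait until CLK 5.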

7.4 C490LDRV 

82490XP Low Drive Buffer 

Selects the 82495XP low capacitance driving buffers 
Input to 82495XP (pin M3) Configuration Signal 
Synchronous to CLK 


7.4.1 SIGNAL DESCRIPTION 

C490LDRV selects the driving strength of the
82495XP buffers that interface to the 82490XP. Refer
to the layout specifications for information on how
C490LDRV should be connected.


7.4.2 WHEN SAMPLED

C490LDRV is a configuration input sampled as shown in
Figure 7-1. C490LDRV requires a setup time of 4 CPU
clocks. After sampling, C490LDRV is a “don’t care”
until it is sampled as the BGT# pin after the first
CADS# assertion.


7.4.3 RELATION TO OTHER SIGNALS

C490LDRV shares a pin with BGT#. 


7.5 CADS# 

Cache Address Strobe 

Indicates beginning of cache cycle 

Output from 82495XP (pin E3) Cycle Control Signal 

Synchronous to CLK 

7.5.1 SIGNAL DESCRIPTION 

CADS# requests the execution of a memory bus
cycle to the MBC, and indicates that the cycle attributes
(i.e., CD/C#, CM/IO#, CW/R#, PALLC#,
etc.) are valid.

If the 82495XP receives a snoop hit to an [M] state
line before BGT# is asserted by the MBC, the current
CADS# is aborted and reissued after the snoop
has completed. If the current line (issued by the
stalled CADS#) is invalidated by the snoop, then
that CADS# is cancelled (i.e., will not be reissued
after the snoop is completed).

CADS# is a glitch-free signal. 

7.5.2 WHEN DRIVEN 

CADS# is asserted by the 82495XP for exactly one 
CLK, and is always a valid logic level. 

7.5.3 RELATION TO OTHER SIGNALS 

CADS#, when asserted, indicates that the cache
cycle control and attribute signals (e.g., CD/C#,
NENE#, CW/R#) are valid.

Since allocations do not require BRDY#s to the
CPU, the CDTS# of an allocation cycle will always
occur with CADS# of the allocation. In normal cycles
the 82495XP will generate CADS# followed by
CDTS#.

CADS# and CDTS# are asserted in the same CLK
for all write-through cycles.

Once CADS# is active, PALLC#, CWAY, CDTS#,
and BUS# are valid. Address and cycle specification
signals (MSET0-MSET10, MTAG0-MTAG11,
MCFA0-MCFA6, CW/R#, CM/IO#, CD/C#,
RDYSRC, MCACHE#, NENE#, SMLN#, KLOCK#,
and CPLOCK#) will be valid with CADS# active as
well.

Every CADS# initiated cycle requires a BGT# and 
CRDY# input from the MBC. 

CADS# and SNPADS# will never be asserted on 
the same CLK. 


7.6 CAHOLD 

82495XP AHOLD Output 
Self-test result and AHOLD output status 
Output of 82495XP (pin G4) Test Signal 
Synchronous to CLK 


7.6.1 SIGNAL DESCRIPTION 

CAHOLD has two functions. One, it indicates the result
of the built-in self-tests of the 82495XP. Two, it
represents the 82495XP AHOLD into the i860 XP
CPU.

The 82495XP drives CAHOLD after the 82495XP
self-tests have completed. CAHOLD should be
latched when FSIOUT# goes inactive after reset. If
CAHOLD is high, the self-tests have passed; otherwise
they have failed.

When the 82495XP drives AHOLD to the i860 XP
CPU, it also drives CAHOLD, thus providing a means
of tracking inquire cycles and back invalidations for
performance monitoring.

7.6.2 WHEN DRIVEN

CAHOLD is always at a valid logic level. During self-test,
CAHOLD is held until the clock edge that FSIOUT#
is sampled inactive. After self-test, or reset,
CAHOLD is asserted whenever the 82495XP asserts
AHOLD.


7.6.3 RELATION TO OTHER SIGNALS

CAHOLD reflects the value of AHOLD except during
self-test. During self-test, the value of CAHOLD
should be latched with the falling edge of FSIOUT#
to determine pass/fail.


7.7 CD/C# 

Cache Data/Code 

Indicates whether current cycle is Code or Data 
Output from 82495XP (pin D3) Cycle Control Signal 
Synchronous to CLK 


7.7.1 SIGNAL DESCRIPTION 

CD/C#, along with CW/R# and CM/IO#, is an
82495XP cycle definition signal. It indicates the type
of bus cycle being requested of the MBC. CD/C#
can be pipelined by the memory bus controller (by
using the CNA# input to the 82495XP).


7.7.2 WHEN DRIVEN

CD/C# is valid in the same CLK as CADS# and
remains valid until CRDY# or CNA#. CD/C# is always
a valid logic level.


7.7.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.8 CDATA0-CDATA7 

CPU Data Bus Connection 

Data Bus Connection from 82490XP to CPU 

Input/Output to 82490XP (pins 48, 54, 49, 55, 46,
51, 52, 57)

Isolated Interface

7.8.1 SIGNAL DESCRIPTION

CDATA0-CDATA7 is the 82490XP data bus connection to
the CPU. All or part of these 8 pins will be used in
connecting the 82490XP to the CPU depending on
the cache configuration. See layout information for
details.


7.9 CDTS# 

Cache Data Strobe 

Indicates availability of CPU data/data bus 
Output from 82495XP (pin F4) Cycle Control Signal 
Synchronous to CLK 

7.9.1 SIGNAL DESCRIPTION 

For read cycles, CDTS#, when asserted, indicates
that in the next CPU clock the data bus path is available.
This is the earliest time in which BRDY# may
be supplied to the CPU. For CPU-initiated write cycles,
it indicates that the data is available on the
memory bus. For i860 XP CPU inquire cycles,
CDTS# informs the MBC that the last piece of inquire
data is valid on the CPU bus.

Usage of this signal allows complete independence
between address strobes (CADS# and SNPADS#)
and data strobe. CDTS# allows the 82495XP to signal
the MBC that a new cycle has begun as soon as
addresses are available. This allows memory bus cycles
to start before data is ready to be given/taken.

CDTS# is a glitch-free signal.


7.9.2 WHEN DRIVEN

CDTS# is asserted for one CLK, at the same time or
later than CADS# for any given cycle.


7.9.3 RELATION TO OTHER SIGNALS

When the MBC samples CDTS# asserted, it can
begin providing BRDY#s for the read cycle to the
CPU in the next CLK. CDTS# must always be asserted
before CRDY# and must be asserted prior to
the first BRDY#.

The CDTS# of an allocation will always occur with 
CADS# of the allocation. In normal cycles the 
82495XP will generate CDTS# following CADS#. 

CDTS# will be asserted at least one CLK after 
SNPADS#. 


7.10 CFG0-CFG2 

Configuration Pins 
Determine Cache Characteristics 

Input to 82495XP (pins L4, Q1, M4) Configuration
Signals

Synchronous to CLK

7.10.1 SIGNAL DESCRIPTION

CFG0-CFG2 are the 3 cache configuration inputs
that determine cache characteristics such as line ratio,
tag size, and lines per sector. During RESET, this
information is passed on to the 82490XPs. The following
table maps CFG0-CFG2 to their respective
configurations for the i860 XP CPU:

Config  Line   Lines/  No. of
No.     Ratio  Sector  Tags    CFG2  CFG1  CFG0
1       1      1       8K      0     0     1
2       2      1       4K      1     1     1
3       1      2       8K      0     0     0
4       2      1       8K      0     1     1
5       4      1       4K      1     1     0


7.10.2 WHEN SAMPLED

CFG0-CFG2 are sampled as shown in Figure 7-1 with a
setup time of at least 10 CPU clocks. After sampling,
CFG0, CFG1, and CFG2 become cycle progress input
signals to the 82495XP and are sampled after
CADS# of the first cycle.


7.10.3 RELATION TO OTHER SIGNALS

CFG0 shares a pin with CNA#, CFG1 shares a pin
with SWEND#, and CFG2 shares a pin with
KWEND#.
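
The CFG0-CFG2 encodings tabulated in Section 7.10.1 can be expressed as a
simple lookup, which may be convenient in board bring-up scripts or
simulation test benches. The Python sketch below transcribes the table as
printed. The names CacheConfig, CFG_TABLE, and decode_cfg are hypothetical
and chosen for this example. Pin combinations that do not appear in the
table are rejected because the table does not define them; the source does
not state how the 82495XP treats them.

```python
from typing import NamedTuple


class CacheConfig(NamedTuple):
    config_no: int
    line_ratio: int
    lines_per_sector: int
    tags: str  # number of tags, as printed in the table ("8K" or "4K")


# Keyed by (CFG2, CFG1, CFG0), transcribed from the table in Section 7.10.1.
CFG_TABLE = {
    (0, 0, 1): CacheConfig(1, 1, 1, "8K"),
    (1, 1, 1): CacheConfig(2, 2, 1, "4K"),
    (0, 0, 0): CacheConfig(3, 1, 2, "8K"),
    (0, 1, 1): CacheConfig(4, 2, 1, "8K"),
    (1, 1, 0): CacheConfig(5, 4, 1, "4K"),
}


def decode_cfg(cfg2: int, cfg1: int, cfg0: int) -> CacheConfig:
    """Return the cache configuration selected by the strapping pins."""
    key = (cfg2, cfg1, cfg0)
    if key not in CFG_TABLE:
        raise ValueError(f"CFG2..CFG0 = {key} is not listed in the table")
    return CFG_TABLE[key]
```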


7.11 CLK 

i860 XP CPU, 82495XP, 82490XP Clock 
Input to the 82495XP (pin D11)

7.11.1 SIGNAL DESCRIPTION

The CLK input determines the execution rate and
timing of the 82495XP, 82490XP, and CPU. Pin timings
are specified relative to the rising edge of this
signal. The i860 XP CPU, 82495XP, and 82490XP
require TTL levels on CLK for proper operation.


7.12 CM/IO#

Cache Memory/IO

Indicates whether current cycle is Memory or I/O
Output from 82495XP (D4) Cycle Control Signal
Synchronous to CLK


7.12.1 SIGNAL DESCRIPTION

CM/IO#, along with CW/R# and CD/C#, is an
82495XP cycle definition signal. It indicates the type
of bus cycle being requested of the MBC. CM/IO#
can be pipelined by the memory bus controller
(CNA# input to the 82495XP).


7.12.2 WHEN DRIVEN

CM/IO# is valid in the same CLK as CADS#, and
remains valid until CRDY# or CNA#.


7.12.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6, CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS# assertion.


7.13 CNA# [CFG0]

82495XP Next Address Enable

Dynamically pipelines CADS# cycles

Input to 82495XP (pin L4) Cycle Progress Signal

Synchronous to CLK

7.13.1 SIGNAL DESCRIPTION

CNA# is used by the MBC to dynamically pipeline
CADS# cycles. When active, it indicates to the
82495XP that the next MBC request can be started.
Only one level of pipelining is allowed in the
82495XP.

CNA# is an optional input for all cycles initiated with
CADS#.


7.13.2 WHEN SAMPLED

CNA# is sampled starting in the first CLK in which
BGT# is sampled active until CRDY# is sampled
active. CNA# is then ignored until the BGT# of the
next cycle.

CNA# is ignored during snoop write-back cycles. 


7.13.3 RELATION TO OTHER SIGNALS 

Once the 82495XP samples this signal active, it is- 
sues the CADS# for the next memory bus cycle as 
soon as one begins. 

CNA# is recognized between BGT# and CRDY# 
or CDTS# and CRDY# of a given cycle. 


7.14 CRDY# 

Cache Ready 

Ends a cycle in the 82495XP/82490XP 

Input to 82495XP and 82490XP (pins M2, 43) Cycle 
Progress Signal 

Synchronous to CLK 

7.14.1 SIGNAL DESCRIPTION 

CRDY# is used by the 82495XP and 82490XP to
end a memory bus cycle. CRDY# indicates full completion
of the cycle and allows the
82495XP/82490XP to free internal resources for the
next cycle. In the 82490XP, this means that the current
memory buffer in use is emptied (put in array,
discarded, etc.). In the 82495XP, CRDY# assertion
allows 82495XP cycle progress signals (BGT#,
KWEND#, SWEND#) to be sampled for the next
cycle if pipelining is used.

CRDY# is required for all 82495XP/82490XP memory
bus cycles, including snoop cycles. CRDY#
must be asserted to the 82495XP and 82490XP at
the same time.

7.14.2 WHEN SAMPLED 

CRDY# for a given cycle is ignored until KWEND# 
is returned for that cycle. If KWEND# is not required 
for the cycle, CRDY# is ignored until BGT#. When 
CRDY# is ignored, it may violate setup and hold 
times. 

7.14.3 RELATION TO OTHER SIGNALS 

CRDY# must be sampled by the 82495XP and 
82490XP at the same time. For the 82495XP, 
CRDY# has many cycle implication rules: 

1. CRDY# > CDTS# 

2. CRDY# > BGT# 

3. CRDY# > BGT# + 2 clocks if cycle is a line-fill 
or allocation 

4. CRDY# > KWEND# if cycle is a line-fill or write- 
through with potential allocation (PALLC# = 0) 

For the 82490XP, CRDY# has three basic rules: 

1. MEOC# for cycle N must be sampled with or before
CRDY# for cycle N.

2. MEOC# for cycle N + 1 must be sampled at least 
2 CPU clocks after CRDY# for cycle N. 

3. CRDY# for cycle N + 1 must be after the last 
BRDY# for cycle N. 
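
The three 82490XP rules above relate events in two consecutive cycles, N and
N + 1, and can be checked with a few comparisons. The Python sketch below is
a minimal illustration. The function name, the argument names, and the use of
integer CPU clock numbers for each event are assumptions for this example and
do not come from the datasheet.

```python
from typing import List


def check_82490xp_crdy_rules(meoc_n: int, crdy_n: int, last_brdy_n: int,
                             meoc_n1: int, crdy_n1: int) -> List[str]:
    """Check the three 82490XP CRDY# rules for cycles N and N + 1.

    Each argument is the CPU clock number in which the named signal
    is sampled active. Returns a list of violated rules.
    """
    errors = []
    # Rule 1: MEOC# for cycle N with or before CRDY# for cycle N.
    if meoc_n > crdy_n:
        errors.append("rule 1: MEOC#(N) must be with or before CRDY#(N)")
    # Rule 2: MEOC# for cycle N + 1 at least 2 clocks after CRDY#(N).
    if meoc_n1 < crdy_n + 2:
        errors.append("rule 2: MEOC#(N+1) must be >= 2 clocks after CRDY#(N)")
    # Rule 3: CRDY# for cycle N + 1 after the last BRDY# for cycle N.
    if crdy_n1 <= last_brdy_n:
        errors.append("rule 3: CRDY#(N+1) must be after last BRDY#(N)")
    return errors
```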

MBRDY# fills the current 82490XP memory buffer.
CRDY# empties this buffer and makes it available for
new cycles. CRDY# may be asserted on the same
clock as MEOC#, which may be asserted on the
same clock as MBRDY#.

CRDY# shares a pin with SLFTST#. 

7.15 CWAY 

Cache Way 

Indicates WAY used by the current cycle 
Output from 82495XP (pin J3) Cycle Control Signal 
Synchronous to CLK 


7.15.1 SIGNAL DESCRIPTION 

CWAY is a cycle definition signal which indicates to 
the MBC the WAY used by the requested cycle. On 
line-fills it indicates the way the line will be loaded. 
For write-hits (to [S] state or LOCKed) it indicates 
the way which was a hit. For write-backs it indicates 
the way that was written-back. 

CWAY is utilized by external tracking machines in 
order for the 82495XP tags to be accurately dupli- 
cated. 


7.15.2 WHEN DRIVEN 

CWAY is valid together with CADS# and remains 
valid until CRDY# or CNA#. 


7.15.3 RELATION TO OTHER SIGNALS 

CWAY is valid with CADS#. 


7.16 CW/R# 

Cache Write/Read

Indicates whether current cycle is write or read 
Output from 82495XP (pin E4) Cycle Control Signal 
Synchronous to CLK 

7.16.1 SIGNAL DESCRIPTION 

CW/R#, along with CD/C# and CM/IO#, is an
82495XP cycle definition signal. It indicates the type
of bus cycle being requested of the MBC. CW/R#
can be pipelined by the memory bus controller
(CNA# input to the 82495XP).


7.16.2 WHEN DRIVEN 

CW/R# is valid in the same CLK as CADS# and is 
valid until CRDY# or CNA#. 


7.16.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.17 DRCTM#

Memory Bus Direct to [M] State 

Signals 82495XP to tag data direct to the [M] state, 
skipping the [E] and [S] states. 

Input to the 82495XP (pin M1) Cycle Attribute Signal 
Synchronous to CLK 

7.17.1 SIGNAL DESCRIPTION 

DRCTM# is an input to the 82495XP from the mem- 
ory bus. When sampled active at the end of the 
snooping window (SWEND# activation), the 
82495XP moves the line fill in progress directly to 
the [M] state. 

There are three cases in which this is useful.

1. Simplifies External State Tracker

External trackers can only track the [M], [S], and
[I] states. The [E] state cannot be tracked externally
since cache write hits internally change [E]
state lines to [M] state. DRCTM# can be used to
eliminate the [E] state from the MESI protocol.

2. Read For Ownership

During a write miss with allocation the write may
go to the memory buffer and not be written to
memory. A read from memory, in conjunction with
the MFRZ# signal asserted, reads the data to fill
around the bytes written by the CPU. The contents
of the memory buffer are then entered into
the cache. The cache would normally tag this
data in the [E] state (the cache assumes the
write went to main memory). The system has the
option of never completing the write to memory
(this increases performance by completing the
allocation more quickly). If the write is not performed to
memory, the cache is the only owner of the new
data and therefore the cache entry must be
tagged to the [M] state.

3. Cache to Cache Transfer

A cache to cache transfer may occur as a result
of a snoop. For example, suppose CPU/Cache 1
performs a read from main memory and CPU/Cache
2 flags it as a snoop hit to an [M] state line. To
expedite the transfer, the system may perform
the write-back from CPU/Cache 2 directly to
CPU/Cache 1, bypassing memory. CPU/Cache 1
assumes the write-back went to memory and
would normally tag the line to the [S] state. Since
the system did not perform the write to memory,
the system should drive DRCTM# to force the
line to the [M] state. In addition, the line should
be invalidated in CPU/Cache 2 by driving
SNPINV.


7.17.2 WHEN SAMPLED 

DRCTM# is synchronous to CLK. It is only sampled
when SWEND# is active (the end of the snooping
window). When SWEND# is inactive, DRCTM# is
ignored and does not have to meet setup and hold
times.


7.17.3 RELATION TO OTHER SIGNALS

DRCTM# (direct to [M]) and MWB/WT# (write policy)
combine to define the memory bus attributes and
are sampled on CLK at the end of the snooping window
(SWEND# activation).

If MRO# is sampled active during KWEND#,
DRCTM# is ignored.
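
The effect of DRCTM# on the tag state of a line fill, as described in
Sections 7.17.1 and 7.17.3, can be summarized in a few lines. The Python
sketch below is a simplified illustration, not a model of the 82495XP state
machine. The function name and arguments are hypothetical, the caller
supplies the state the line would otherwise receive ([E] or [S]), and the
influence of MWB/WT# is deliberately left out.

```python
def line_fill_tag_state(default_state: str, drctm_active: bool,
                        mro_active: bool) -> str:
    """Tag state for a line fill after the snooping window closes.

    default_state: state the fill would get without DRCTM# ("E" or "S").
    drctm_active:  DRCTM# sampled active with SWEND#.
    mro_active:    MRO# sampled active with KWEND#.
    """
    if default_state not in ("E", "S"):
        raise ValueError("default_state must be 'E' or 'S'")
    if mro_active:
        # DRCTM# is ignored when MRO# was sampled active.
        return default_state
    # DRCTM# moves the fill directly to [M], skipping [E] and [S].
    return "M" if drctm_active else default_state
```

In the cache-to-cache transfer case of Section 7.17.1, for example, the
receiving cache would default to "S", and driving DRCTM# yields "M".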


7.18 FLUSH# 

Flush 

Causes a 82495XP Cache Flush 

Input to 82495XP (N4) Cache Synchronization Signal

Asynchronous input

7.18.1 SIGNAL DESCRIPTION

This signal causes the 82495XP to flush all its modified
lines to main memory. Flushing modified
lines requires the 82495XP to perform back-invalidation
and inquire cycles to the CPU. At the end of the
flush, the 82495XP tag array will be completely
invalidated.

FLUSH# will invalidate the entire 82495XP tag array.
It takes two clocks to look up and invalidate a
tag entry. The 82495XP will also invalidate tags in
the CPU cache by running back-invalidation cycles.
If the 82495XP tag state is modified, the 82495XP
will run inquire cycles to the i860 XP CPU to see if
the line is modified in its cache. If so, the i860 XP
CPU will write back the line into the 82495XP write
buffer. All modified 82495XP cache lines must be
written to memory.

7.18.2 WHEN SAMPLED 

FLUSH# can be asserted at any time. The 82495XP
will complete all outstanding transactions on the
CPU and memory bus before beginning the
FLUSH# process. The memory bus controller does
not have to prevent FLUSH# during locked cycles
because the 82495XP will complete its locked transaction
before the FLUSH# process begins.

Once a FLUSH# operation has begun, the FLUSH#
signal is ignored until the operation completes. If
RESET is activated while the FLUSH# operation is
in progress, the FLUSH# operation will be aborted
and the RESET immediately executed.

FLUSH# is an asynchronous input. FLUSH# must
have a pulse width of 2 CLKs in order to guarantee
82495XP recognition.

7.18.3 RELATION TO OTHER SIGNALS 

To initiate a FLUSH#, the 82495XP will complete all
pending cycles and prohibit the processor from issuing
any further ADS#s while the FLUSH# is in
progress. The FSIOUT# output signal is used to indicate
the start and end of the FLUSH# operation. It
will become active when the FLUSH# signal is internally
recognized (all outstanding cycles have completed)
and will deactivate with the CRDY# of the
last FLUSH# write-back.

The memory bus controller supplies BRDY# to the
CPU once FSIOUT# has gone inactive and the
FLUSH# is complete. Once FLUSH# has begun, and
FSIOUT# is active, all CADS#s and CRDY#s correspond
to write-backs caused by the FLUSH# operation.

The 82495XP can be snooped during FLUSH# cy- 
cles and the snooping protocols will be the same as 
that for any memory bus cycle. 


7.19 FPFLD# [FPFLDEN] 

External FIFO PFLD 

Indicates PFLD cycle during external PFLD FIFO 
mode 

Output of the 82495XP (J4) Cycle Control Signal 
Sync to CLK 


7.19.1 SIGNAL DESCRIPTION 

During RESET, this pin functions as the FPFLDEN 
configuration signal. The 82495XP can be config- 
ured to decode the i860 XP microprocessor’s PFLD 
cycles. The 82495XP supports 3 operational modes 
for PFLD cycle decoding, as defined by FPFLDEN 
and NCPFLD#: 

Mode #1. PFLD cycles are cached in the 82495XP.

Mode #2. PFLD cycles are not cached in the
82495XP, without an external PFLD extension
FIFO.

Mode #3. PFLD cycles are not cached in the 82495XP,
with an external PFLD extension FIFO.

Mode          FPFLDEN  NCPFLD#
1             0        1
2             0        0
3             1        1
Illegal Mode  1        0

If mode 3 has been selected, the 82495XP allows
the PFLD pipeline to be extended with an external
FIFO. After RESET, when this mode has been selected,
the FPFLD# output will indicate that the requested
cycle is a PFLD cycle. See Section 5.2.5 for
more details.
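
The mode selection in Section 7.19.1 reduces to a two-input lookup on the
levels sampled at RESET. The Python sketch below transcribes the mode table.
The function name pfld_mode and its argument names are hypothetical, and the
arguments are the logic levels on the pins (0 or 1), not "active/inactive"
flags.

```python
def pfld_mode(fpflden: int, ncpfld_n: int) -> int:
    """Return the PFLD decoding mode (1, 2 or 3) for the sampled pin levels.

    fpflden:  level on FPFLDEN at RESET.
    ncpfld_n: level on NCPFLD# at RESET.
    """
    modes = {
        (0, 1): 1,  # PFLD cycles cached in the 82495XP
        (0, 0): 2,  # not cached, no external PFLD extension FIFO
        (1, 1): 3,  # not cached, with external PFLD extension FIFO
    }
    key = (fpflden, ncpfld_n)
    if key not in modes:
        # FPFLDEN = 1 with NCPFLD# = 0 is listed as an illegal mode.
        raise ValueError("illegal mode: FPFLDEN=1, NCPFLD#=0")
    return modes[key]
```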


7.19.2 WHEN DRIVEN 

FPFLDEN is sampled on RESET as in Figure 7-1,
with a setup time of 4 CPU clocks. In PFLD mode
#3, the FPFLD# output is valid in the same CLK as
CADS# and remains valid until CRDY# or CNA#.


7.19.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.20 FSIOUT# 

Flush, Sync, Initialization Output 
Indicates the start and end of the Flush, 

Sync, and Initialization operations. 

Output of the 82495XP (D1) Cache Synchronization 
Signal 

Sync to CLK 

7.20.1 SIGNAL DESCRIPTION 

This signal indicates the start and the end of either a 
Flush, Sync, or Initialization (including self-test if re- 
quested) operation. These operations are mutually 
exclusive. This signal is activated when the 82495XP 
begins the operation and goes inactive upon com- 
pletion of the operation. 


7.20.2 WHEN DRIVEN 

This signal will be asserted whenever a Flush, Sync,
or Initialization operation is internally recognized by
the 82495XP and is in progress.


7.20.3 RELATION TO OTHER SIGNALS

FSIOUT# active indicates that either a Flush, Sync, or
Initialization operation is in progress. Only one of
these operations can be run within the 82495XP at a
time.

The table below shows the priorities of these three
operations:

Operation       Trigger  Priority
Initialization  RESET    Highest
Flush           FLUSH#
Sync            SYNC#    Lowest

If a trigger of higher priority occurs while a lower
priority operation is running, the lower priority operation
is aborted and the higher priority one executed.
If a trigger of lower priority occurs when a higher
priority one is running, the lower priority trigger is
ignored. Once a FLUSH# or SYNC# operation has
begun, its trigger is ignored until the operation completes.

When a higher priority operation aborts a lower priority
one, FSIOUT# remains active.

Since RESET, FLUSH# and SYNC# are all asynchronous,
FSIOUT# will be activated when the
82495XP is actually internally executing the operation.


7.21 HIGHZ#

High Impedance Outputs
Causes 82495XP outputs to be tristated
Input to 82495XP (pin P4) Test Signal
Synchronous to CLK


7.21.1 SIGNAL DESCRIPTION

The 82495XP will enter self-test if SLFTST# is
active and HIGHZ# is inactive during reset. If
SLFTST# is sampled active and HIGHZ# is sampled
active during reset, the 82495XP floats all its
outputs until the 82495XP is reset again. Activation
of HIGHZ# without SLFTST# does nothing.


7.21.2 WHEN SAMPLED

HIGHZ# is sampled as shown in Figure 7-1 with a setup
time of 10 CPU clocks. HIGHZ# is then a don’t care until
the 82495XP reset sequence is complete (with FSIOUT#
going inactive), at which point it becomes the MBALE
pin.


7.21.3 RELATION TO OTHER SIGNALS

HIGHZ# shares a pin with MBALE. 82495XP outputs
are tristated if both HIGHZ# and SLFTST# are
sampled active during reset.


7.22 KLOCK#

82495XP LOCK#

Request to MBC of LOCKed cycle

Output from 82495XP (pin C3) Cycle Control Signal

Synchronous to CLK


7.22.1 SIGNAL DESCRIPTION

KLOCK# indicates to the MBC that there is a request
to execute a locked cycle. This signal follows
the CPU lock request.

KLOCK# is simply a one-clock flow-through version
of the CPU LOCK# signal. The 82495XP will activate
KLOCK# with CADS# of the first cycle of a
LOCKed operation and it will remain active until the
CADS# of the last cycle of the LOCKed operation.

Note that if the memory bus is pipelined, there may
be a situation in which KLOCK# deactivation is in
the same CLK as its new activation (together with
CADS#). In this case KLOCK# won’t go inactive
between back-to-back locked sequences. KLOCK#
will never go inactive if the CPU LOCK# does not go
inactive. The 82495XP will not open arbitration windows
between back-to-back locked sequences; it is
the memory bus controller’s responsibility to implement
this functionality by detecting a LOCKed write
followed by a LOCKed read.

KLOCK# activation is not qualified by the tag array
look-up (hit/miss indications); therefore, KLOCK#
can be active before CADS# is asserted.

7.22.2 WHEN DRIVEN

KLOCK# assertion is a flow-through of 1 CLK from
the CPU LOCK# after the 82495XP completes all
pending cycles. KLOCK# deassertion is a flow-through
of 1 CLK from the CPU LOCK# signal, and
must be at least 1 CLK after the last CADS# of a
LOCKed sequence. KLOCK# is always driven to a
valid logic level.


7.22.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6, CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.23 KWEND#

Cacheability Window End 

Closes 82495XP Cacheability Window 

Input to 82495XP (pin M4) Cycle Progress Signal 

Synchronous to CLK 

7.23.1 SIGNAL DESCRIPTION 

KWEND# is a cycle progress input to the 82495XP 
that, when active, closes the cacheability window 
and causes the cacheability attributes MKEN# and 
MRO# to be sampled. 

KWEND# is sampled by the 82495XP after BGT# 
has been sampled active. KWEND# should be as- 
serted by the MBC once the memory address has 
been decoded and cacheability (MKEN#) and read- 
only (MRO#) attributes have been determined. 

The sampling of KWEND# active allows SWEND# 
to be sampled. Resolving KWEND# quickly allows 
the non-cacheable window between BGT# and 
SWEND# to be closed more quickly. KWEND# ac- 
tivation also allows the 82495XP to start allocations 
and begin replacements. 

7.23.2 WHEN SAMPLED

KWEND# is sampled by the 82495XP on or after the
clock in which BGT# has been sampled active. Once
KWEND# is sampled active it is not sampled again
until BGT# of the next cycle. KWEND# need not
follow setup and hold times if it is not being sampled.

BGT#, KWEND# and SWEND# may be asserted 
on the same clock edge. 

KWEND# need only be activated for those cycles 
which require the sampling of MKEN# and MRO#. 
These are line-fills and write cycles with potential 
allocation. 
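
The sampling rule in this section can be sketched as a small state machine. This is an illustrative Python model only, not part of the 82495XP specification: the class name KwendSampler and its boolean arguments (True meaning the active-low pin is sampled active on that clock) are assumptions made for the example.

```python
class KwendSampler:
    """Hypothetical model of when KWEND# is looked at, one call per CLK."""

    def __init__(self):
        # True from the clock BGT# is sampled active until KWEND# is
        # sampled active; outside that window KWEND# is a don't care.
        self.window_open = False

    def clk(self, bgt_active, kwend_active):
        """Return True on the clock where KWEND# closes the window."""
        if bgt_active:
            self.window_open = True
        if self.window_open and kwend_active:
            # MKEN# and MRO# would be sampled on this clock.
            self.window_open = False
            return True
        return False


sampler = KwendSampler()
for bgt, kwend in [(False, True), (True, False), (False, True), (False, True)]:
    print(bgt, kwend, sampler.clk(bgt, kwend))
```

Because BGT# is checked before KWEND# inside the same call, the model also accepts both signals asserted on the same clock edge.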

7.23.3 RELATION TO OTHER SIGNALS 

KWEND# is sampled on or after BGT# and allows
the sampling of SWEND#. KWEND# activation 
causes the sampling of MKEN# and MRO#. 

According to cycle progress implication rules, 
CRDY# must be at least one clock after KWEND# 
for line fills and write-through cycles with potential 
allocate. 

KWEND# shares a pin with CFG2. 


7.24 MALE 

Memory Address Latch Enable 
Tristates/Enables Memory Address Outputs
Input to 82495XP (pin 02) Cycle Control Signal 
Asynchronous 

7.24.1 SIGNAL DESCRIPTION 

The 82495XP contains an address latch which controls
the last stage of the 82495XP address output. It
is controlled by four signals: MAOE#, MBAOE#,
MALE, and MBALE. The signals MALE and MBALE
control the latching of the entire 82495XP address,
where MBALE controls the subline portion and
MALE controls the rest.

MALE is provided so that the memory bus controller
can control when the next pipelined address is driven.
With MALE high, the 82495XP address latch is in
“flow-through” mode and the 82495XP address is
available at the memory bus. Changes in the
82495XP address are seen immediately at the memory
bus. When MALE is driven low, the address at
the latch input is latched. Any subsequent address
driven by the 82495XP will not be seen at the memory
bus outputs until MALE is driven high again.

MALE will latch 82495XP addresses regardless of
the state of MAOE#. If MAOE# is inactive, MALE
will still operate the latch properly, but the memory
bus will be tristated.
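
The interaction between the latch enable and the output enable can be sketched as follows. This is a behavioral toy model in Python under stated assumptions: the class AddressLatch, its method name, and the use of None to represent a floated bus are hypothetical and do not come from the data sheet; no timing or propagation delay is modeled.

```python
class AddressLatch:
    """Toy model of a transparent latch followed by a tristate output."""

    def __init__(self):
        self.held = None  # address captured while the latch is transparent

    def output(self, address, male_high, maoe_active):
        # MALE high: flow-through, the latch follows its input.
        if male_high:
            self.held = address
        # MALE low: self.held keeps the last address seen while MALE was high.
        # MAOE# inactive: the bus floats, but the latch state is unaffected.
        return self.held if maoe_active else None


latch = AddressLatch()
first = latch.output(0x1000, male_high=True, maoe_active=True)     # flows through
held = latch.output(0x2000, male_high=False, maoe_active=True)     # latched value kept
floated = latch.output(0x3000, male_high=False, maoe_active=False) # bus tristated
print(first, held, floated)
```

The same structure applies to the subline portion of the address, with MBALE and MBAOE# in place of MALE and MAOE#.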


7.24.2 WHEN SAMPLED 

MALE is asynchronous and can be asserted and 
deasserted at any time. MALE should always be 
driven to a valid state since it directly controls the 
operation of the address latch. 

7.24.3 RELATION TO OTHER SIGNALS 

MALE together with MBALE control the latching of 
the entire 82495XP output address. The other latch 
control signals, MAOE# and MBAOE#, provide the 
memory bus controller complete command over the 
address outputs. MAOE# and MBAOE# do not af- 
fect the operation of MALE or MBALE. 

MALE shares a pin with the WWOR# configuration 
pin. 
7.25 MAOE#

Memory Address Output Enable 
Tristates/Enables Memory Address Outputs 
Input to 82495XP (pin S4) Cycle Control Signal 
Asynchronous except during snoop cycles 


7.25.1 SIGNAL DESCRIPTION 

The 82495XP has an address latch which is controlled
by a latch input, MALE, and an output enable
input, MAOE#. MAOE# has two main functions.
One, driving MAOE# active will enable the 82495XP
to drive its address lines MTAG0-11, MSET0-10,
and MCFA0-6. Two, MAOE# is a qualifier for snoop
cycles and must be inactive for the 82495XP to
snoop.

In general, MAOE# should be active if its 82495XP 
is the current bus master. When that 82495XP gives 
up the bus, MAOE# should be inactive to float the 
address lines and allow another master to snoop. 

MAOE# controls the output of the 82495XP ad- 
dress except the subline (burst) portion. This portion 
has a separate output control: MBAOE#. 


7.25.2 WHEN SAMPLED 

MAOE# is an asynchronous input (except during
snoop cycles) and always has full control over the
address output. For this reason, MAOE# must always
be driven to a valid state.

The 82495XP does, however, sample MAOE# during
snoop cycles. When sampled, MAOE# must
meet proper setup and hold times. In synchronous
snoop mode MAOE# is sampled on a CLK edge. In
clocked mode MAOE# is sampled on a SNPCLK
edge. In strobed mode MAOE# is sampled with the
falling edge of SNPSTB#. If MAOE# is sampled active,
the snoop will be ignored. This allows
SNPSTB# to share a common line for multiple
82495XPs.

MAOE# need not meet any setup or hold time if it is
not being sampled during a snoop cycle.
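
As a compact summary of the snoop qualification described above, the following Python sketch maps each snoop mode to its sampling reference and shows the effect of MAOE# on snoop acceptance. The mode strings and function names are hypothetical labels chosen for this example.

```python
# Sampling reference for MAOE# in each snoop mode, as described in the text.
SNOOP_SAMPLE_REFERENCE = {
    "synchronous": "CLK edge",
    "clocked": "SNPCLK edge",
    "strobed": "SNPSTB# falling edge",
}


def sample_reference(mode):
    """Return the edge against which MAOE# setup and hold are measured."""
    return SNOOP_SAMPLE_REFERENCE[mode]


def snoop_accepted(maoe_active):
    """A snoop is ignored when MAOE# is sampled active (device owns the bus)."""
    return not maoe_active


print(sample_reference("strobed"), snoop_accepted(maoe_active=False))
```

With a shared SNPSTB# line, only the devices whose MAOE# is inactive respond to the snoop.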

7.25.3 RELATION TO OTHER SIGNALS 

MAOE# together with MBAOE# control the entire 
82495XP address. Both signals are asynchronous 
and thus need never be synchronized to any clock. 
Both signals are, however, sampled during snoop 
cycles and require proper setup and hold times in 
these situations. 


MALE and MAOE# together provide full control 
over the 82495XP address output latch. 


7.26 MBALE 

Memory Burst Address Latch Enable 
Tristates/Enables Memory Burst Address Outputs 
Input to 82495XP (pin P4) Cycle Control Signal 
Asynchronous 


7.26.1 SIGNAL DESCRIPTION 


The 82495XP address latch is controlled by four sig- 
nals: MAOE#, MBAOE#, MALE, and MBALE. The 
signals MALE and MBALE control the latching of the 
entire 82495XP address where MBALE controls the 
subline portion and MALE controls the rest. 



MALE and MBALE are provided so that the memory
bus controller has complete flexibility over when the next
address is driven. With MBALE high, the subline portion
of the 82495XP address latch is in “flow-through”
mode and the 82495XP subline address is
available at the memory bus. Changes in the
82495XP subline address are seen immediately at
the memory bus. When MBALE is driven low, the
subline address at the latch input is latched. Any
subsequent subline address driven by the 82495XP
will not be seen at the memory bus outputs until
MBALE is driven high again.


MBALE will latch 82495XP addresses regardless of
the state of MAOE# or MBAOE#. If MBAOE# is
inactive, MBALE will still operate the latch properly,
but the subline portion of the memory bus will be
tristated.


Separate line and subline address latch controls are 
provided so that the latch outputs may be driven at 
different times. The table below indicates the subline 
address bits for each line size. 


Line Size (Bytes)    Subline Address
32                   A3, A4
64                   A4, A5
128                  A5, A6
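
The table can be read as a two-bit field whose position depends on the line size. A minimal Python sketch, assuming a plain integer byte address; the names SUBLINE_BITS and subline_index are hypothetical.

```python
# Subline address bits (low, high) for each line size in bytes, per the table.
SUBLINE_BITS = {32: (3, 4), 64: (4, 5), 128: (5, 6)}


def subline_index(address, line_size):
    """Extract the two-bit subline field from a byte address."""
    low, _high = SUBLINE_BITS[line_size]
    # The two bits are adjacent, so shift down to the low bit and mask.
    return (address >> low) & 0b11


print(subline_index(0x18, 32), subline_index(0x10, 64), subline_index(0x60, 128))
```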


7.26.2 WHEN SAMPLED 

MBALE is asynchronous and can be asserted and
deasserted at any time. MBALE should always be
driven to a valid state since it directly controls the
operation of the address latch.
7.26.3 RELATION TO OTHER SIGNALS

MALE together with MBALE control the latching of 
the entire 82495XP output address. The other latch 
control signals, MAOE# and MBAOE#, provide the 
memory bus controller complete command over the 
address outputs. MAOE# and MBAOE# do not af- 
fect the operation of MALE or MBALE. 

MBALE shares a pin with the HIGHZ# configuration 
pin. 


7.27 MBAOE#

Memory Burst Address Output Enable 
Tristates/Enables Memory Subline Address Outputs 
Input to 82495XP (pin P6) Cycle Control Signal 
Asynchronous except during snoop cycles 

7.27.1 SIGNAL DESCRIPTION 

The 82495XP address latch is controlled by four signals:
MAOE#, MBAOE#, MALE, and MBALE.
MAOE# and MBAOE# are the output enables of
this latch for the entire 82495XP address. Specifically,
MBAOE# controls the subline address portion
and MAOE# controls the rest.

MBAOE# has two functions. One, it can tristate the
subline portion of the address separately from the
rest of the address. Since the 82495XP does not
sequence through burst addresses, the memory system
may wish to provide the burst count. This requires
that the 82495XP address burst portion be
tristated after the first transfer. The Subline Address
table appears in Section 7.26, MBALE.

Two, MBAOE# is sampled during snoop cycles. If
MBAOE# is sampled inactive, the snoop write back
cycle, if any, will begin at the subline address provided.
If MBAOE# is sampled active, the snoop write
back will begin at subline address 0. This allows
snoop write backs to begin at the snooped subline
address and progress through the normal burst order.


7.27.2 WHEN SAMPLED 

Like MAOE#, MBAOE# is asynchronous except 
during snoop cycles and can be asserted or deas- 
serted at any time. Since MBAOE# has direct con- 
trol over the address latch, it must always be driven 
to a valid state. 

MBAOE# is, however, sampled during snoop cycles.
In synchronous snooping mode, MBAOE#
must meet proper setup and hold times to CLK’s
rising edge. In clocked mode, MBAOE# must meet
setup and hold times to SNPCLK’s rising edge. In
strobed mode, MBAOE# must meet setup and hold
times to SNPSTB#’s falling edge.

If MBAOE# is not being sampled for a snoop, i.e.,
SNPSTB# is not asserted, MBAOE# need not meet
any setup or hold time.

7.27.3 RELATION TO OTHER SIGNALS 

MAOE# and MBAOE# control the entire 82495XP
address output asynchronously. This address latch
is completely controlled by MALE, MBALE, MAOE#,
and MBAOE#.

MBAOE# is only sampled by the 82495XP during
snoop cycles with SNPSTB#.


7.28 MBRDY# 

Memory Burst Ready 

Burst Ready Input to 82490XP memory buffers 
Input to 82490XP (pin 22) Cycle Progress Signal 
Synchronous to MCLK 


7.28.1 SIGNAL DESCRIPTION 

When in clocked memory bus mode, MBRDY# (with
MSEL# active) is used to advance the memory
burst counter for the 82490XP buffer in use. This
causes either new data to be latched from the memory
bus (read cycle), or new data to be driven from
the 82490XP buffer (write cycle). MBRDY# is sampled
on all MCLK edges in which MSEL# is sampled
active and has no relation to CLK. In strobed mode,
MBRDY# must be tied high as MISTB/MOSTB
strobes data in/out of the 82490XP.

For write cycles, the first piece of write data is available
at the MDATA pins. MBRDY# assertion with
MSEL# active causes the next 32, 64, or 128-bit
slice of write data to be available. If only one slice is
required, MSEL# and MBRDY# need never go active.

For read cycles, the first piece of read data flows
through to the CPU. MBRDY# assertion with
MSEL# active causes the next slice of memory data
to be latched in the 82490XP buffer. BRDY# assertion
will allow this data to be available on the CPU
bus and latch it into the CPU. For cacheable cycles,
MBRDY# needs to be asserted 4 or 8 times depending
on the cache configuration.
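
The qualification of MBRDY# by MSEL# in clocked memory bus mode can be illustrated with a small counter model. This Python sketch is an assumption-laden simplification: the class BurstCounter, its argument names, and the idea of passing the expected transfer count at construction are hypothetical, and strobed mode is not modeled.

```python
class BurstCounter:
    """Toy model of the memory burst counter in clocked memory bus mode."""

    def __init__(self, transfers):
        self.transfers = transfers  # e.g. 4 or 8 for a cacheable cycle
        self.count = 0

    def mclk_edge(self, msel_active, mbrdy_active):
        # MBRDY# is only honored on MCLK edges where MSEL# is sampled active.
        if msel_active and mbrdy_active:
            self.count += 1
        return self.count

    def done(self):
        return self.count >= self.transfers


counter = BurstCounter(transfers=4)
counter.mclk_edge(msel_active=False, mbrdy_active=True)  # ignored: MSEL# inactive
for _ in range(4):
    counter.mclk_edge(msel_active=True, mbrdy_active=True)
print(counter.count, counter.done())
```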
7.28.2 WHEN SAMPLED

MBRDY# is sampled on all MCLK edges where
MSEL# is sampled active. In this way MSEL# qualifies
the MBRDY# input. If MSEL# is sampled inactive,
MBRDY# need not follow setup and hold times
to MCLK.

7.29.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-MSET10,
MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.28.3 RELATION TO OTHER SIGNALS 

MBRDY# is qualified by the MSEL# input. 
MBRDY# advances the memory burst counter for 
the 82490XP in use which either inputs or outputs 
data through MDATA. 

MEOC# switches the 82490XP buffers to the next 
pending cycle, so the last MBRDY# must come be- 
fore or on the clock of MEOC# assertion. 


7.29 MCACHE# 

82495XP Internal Cacheability 

Indicates cycle cacheability attribute 

Output from 82495XP (pin C2) Cycle Control Signal 

Synchronous to CLK 

7.29.1 SIGNAL DESCRIPTION 

MCACHE# is driven by the 82495XP and indicates
that the current cycle may be cached. Data cacheability
is determined later in the cycle by MKEN# assertion.
MCACHE# is asserted for allocation, replacement
write-back cycles, and during cacheable
read-miss cycles (i.e., read-miss cycles in which PCD
is not asserted). It is not asserted for I/O, special, or
locked cycles.


Cycle Type       MCACHE#
Posted Writes    1
Write Backs      0
Read, PCD = 0    0
Read, PCD = 1    1
Allocation       0
I/O Cycles       1
Locked Cycles    1
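
Since MCACHE# is active low, a 0 in the table means the signal is asserted. A memory bus controller model might encode the table as a lookup, as in this Python sketch; the dictionary keys and the function name are hypothetical labels for the rows above.

```python
# Pin level of MCACHE# per cycle type, copied from the table (0 = asserted).
MCACHE_LEVEL = {
    "posted write": 1,
    "write back": 0,
    "read, PCD=0": 0,
    "read, PCD=1": 1,
    "allocation": 0,
    "I/O cycle": 1,
    "locked cycle": 1,
}


def mcache_asserted(cycle_type):
    """True when MCACHE# is driven low (cycle may be cached)."""
    return MCACHE_LEVEL[cycle_type] == 0


print(mcache_asserted("read, PCD=0"), mcache_asserted("I/O cycle"))
```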


7.29.2 WHEN DRIVEN 

MCACHE# is valid in the same CLK as CADS# and 
remains valid until CRDY# or CNA#. 


7.30 MCFA0-MCFA6 
MSET0-MSET10 
MTAG0-MTAG11 

MCFA0-MCFA6 Memory Configuration Address I/O 
MSET0-MSET10 Memory Set Address I/O 
MTAG0-MTAG11 Memory Tag Address I/O 
82495XP Memory Address Inputs/Outputs 

Input/Output of 82495XP (pins N14, P7-P15, Q6-Q16,
R4, R14-R17, S14-S17) Cycle Control Signals

Input Synchronous to CLK, SNPCLK, or SNPSTB#. 
Output from CLK, MAOE# active or MALE high. 

7.30.1 SIGNAL DESCRIPTION 

MSET0-10, MTAG0-11, and MCFA0-6 provide the
complete 30-bit address input/output interface of
the 82495XP to the memory bus. Together they
span the entire CPU address range A2-A31. Depending
on the cache configuration, each pin represents
a different CPU address line (see configuration
section for details).

MSET0-10, MTAG0-11, and MCFA0-6 pass
through an 82495XP output latch. Latching
is controlled by MALE/MBALE, and the output
of this latch is enabled by MAOE#/MBAOE#.

With MAOE#/MBAOE# active, MSET/MTAG/ 
MCFA are 82495XP outputs. They are valid at the 
start of a memory bus cycle at the input of the 
82495XP address latch. If MALE/MBALE is high 
(flow-through) and MAOE#/MBAOE# is active 
(outputs enabled), they are driven to the memory 
bus with CADS#. 

If a new cycle starts and MALE/MBALE is low, the 
previous address remains valid at the 82495XP 
MSET/MTAG/MCFA outputs. Once MALE/MBALE 
goes high, the new address flows through with the 
appropriate propagation delay (MSET/MTAG/ 
MCFA address valid delay from MALE/MBALE go- 
ing high). The new address will be driven to the 
82495XP MSET/MTAG/MCFA outputs if MAOE#/ 
MBAOE# is active. 
If a new cycle starts, MALE/MBALE is high, and
MAOE#/MBAOE# is inactive, the 82495XP MSET/ 
MTAG/MCFA outputs will remain tristated. Once 
MAOE#/MBAOE# is asserted, the new address 
flows through with the appropriate propagation delay 
(MSET/MTAG/MCFA address valid from MAOE#/ 
MBAOE# going active). 

MSET0-10, MTAG0-11, and MCFA0-6 are used
as inputs to the 82495XP during snoop cycles. Here,
MAOE#/MBAOE# is inactive. MSET/MTAG/
MCFA are sampled by the 82495XP during snoop
initiation just like the other snoop attributes.


7.30.2 WHEN SAMPLED 

If MALE/MBALE is high and MAOE#/MBAOE# is
low, MSET0-10, MTAG0-11, and MCFA0-6 are
valid with CADS# with a timing reference to CLK.
Otherwise, they are asserted with a delay from
MALE/MBALE high or MAOE#/MBAOE# active.

MSET0-10, MTAG0-11, and MCFA0-6 change
once CNA# or CRDY# is sampled active. MSET0-10,
MTAG0-11, and MCFA0-6 have a float delay
from MAOE#/MBAOE# going inactive. These outputs
are undefined after CRDY#/CNA# assertion
and before the next CADS# assertion.

As inputs during snoop cycles (SNPSTB# asserted), 
they must be sampled like other snoop attributes 
with proper setup and hold times. In synchronous 
snoop mode this is with respect to CLK; in clocked 
mode, this is with respect to SNPCLK; and in 
strobed mode this is with respect to SNPSTB# fall- 
ing edge. 

If MAOE# is inactive and SNPSTB# is not asserted
(no snoop), MSET0-10, MTAG0-11, and MCFA0-6
need not meet any setup or hold time.

7.30.3 RELATION TO OTHER SIGNALS 

MSET0-10, MTAG0-11, and MCFA0-6 are asserted
with CADS# so they are valid when CADS# is
sampled active. This is true as long as MALE/MBALE
is high and MAOE#/MBAOE# is active. If
MSET0-10, MTAG0-11, and MCFA0-6 have been
asserted but are blocked by MALE/MBALE or
MAOE#/MBAOE#, they are asserted from MALE/
MBALE going high or MAOE#/MBAOE# going active.

MSET0-10, MTAG0-11, and MCFA0-6 are deasserted
or changed with CADS# or CNA# active.
They may also be floated with MAOE# going inactive.


MSET0-10, MTAG0-11, and MCFA0-6 are used
as inputs during snoop cycles. They are sampled
with SNPSTB# like any other snoop attribute signal.


7.31 MCLK 

Memory Bus Clock 

Input to the 82490XP (Pin 26) 

7.31.1 SIGNAL DESCRIPTION 

In a clocked memory bus mode, this pin provides the 
memory bus clock. Memory bus signals and memory 
bus data are sampled on the rising edge of MCLK. 
Memory bus write data is driven off MCLK or 
MOCLK depending upon the configuration. MCLK 
has no relation to CLK. 


7.31.3 RELATION TO OTHER SIGNALS 

MCLK shares a pin with MISTB. 

In clocked memory bus mode, the MDATA7-
MDATA0, MSEL#, MFRZ#, MBRDY#, MZBT#,
and MEOC# pins are sampled synchronously with
the rising edge of MCLK. In a clocked memory bus
write, MDATA7-MDATA0 are driven synchronously
with MCLK or MOCLK.

MOCLK is a delayed version of MCLK. If a clocked
memory bus configuration is chosen, and the
MOCLK rising edge is detected by the 82490XP after
RESET, data will be driven off of MOCLK rather
than MCLK. Only data is affected by MOCLK.
MOCLK is used to allow the system designer to increase
the minimum output time of MDATA relative
to MCLK.


7.32 MDATA0-MDATA7 

Memory Bus Data Pins 

82490XP Connection to the Memory Bus 

Input/Output of 82490XP (pins 18, 14, 10, 6, 16, 12,
8, 4)

Synchronous to CLK or MCLK or MOCLK or MISTB 
or MOSTB. 


7.32.1 SIGNAL DESCRIPTION 

MDATA0-7 is the 82490XP data bus connection to
the memory bus. All or part of these pins will be used
depending on the cache configuration. These pins
are directly controlled by the MDOE# input. With
MDOE# inactive, these pins are tristated and may
be used as inputs.

For write cycles, the 82495XP asserts CDTS# to 
indicate that data will be available at the MDATA 
pins or in its buffer. Data is output with respect to 
CLK, MCLK, MOCLK, or MEOC# and is strobed 
with MBRDY#. In strobed memory bus mode, data 
is output using MOSTB. 

For read cycles, CDTS# indicates that the CPU data 
path will be available for read data in the next clock. 
BRDY# reads data into the CPU from the 82490XP. 
Data is read into the 82490XPs through MDATA us- 
ing MBRDY# or MISTB. 


7.32.2 WHEN DRIVEN 

When the CPU or 82495XP initiates a write cycle, 
the write data is written to the appropriate 82490XP 
buffer and CDTS# is asserted. If MDOE# is active, 
that first piece of write data will be available at the 
MDATA pins with some delay from the CPU CLK 
edge that CDTS# is asserted. Subsequent pieces of 
write data are output with some delay from MCLK or 
MOCLK (mode dependent) from the edge that 
MBRDY# is sampled active. In strobed mode, sub- 
sequent data is output with MOSTB assertion. 

MDATA is not valid before CDTS# assertion, after
MEOC# with no pending cycle, or with MDOE# inactive.


MDOE# must be inactive for MDATA to read data.
CDTS# assertion by the 82495XP indicates that the
read path is available in the next clock. Data must be
read into MDATA with respect to MCLK or MISTB
and must follow proper setup and hold times if
MBRDY# is active or MISTB is changing.

The memory bus controller must account for the 
large setup time required to read data into the CPU. 
If properly done, data can be read into MDATA by 
asserting MBRDY# and in the next full CPU clock 
read into the CPU using BRDY#. 


7.33 MDOE# 

Memory Data Output Enable 
Tristates/Enables Memory Data Outputs
Input to 82490XP (pin 20) Cycle Control Signal 
Asynchronous 



7.33.1 SIGNAL DESCRIPTION 

MDOE# is an input to the 82490XP that, when asserted,
causes the 82490XP to drive its MDATA0-
MDATA7 outputs. When MDOE# is inactive, these
lines are floated and may be used as inputs to the
82490XP. MDOE# is not sampled by any clock and
is a direct connection to the 82490XP memory output
driver.


7.33.2 WHEN SAMPLED 


For read cycles, the 82495XP asserts CDTS# the
clock before the MDATA path is available for read
data. MDOE# must be inactive for the 82490XP to
read data. Read data is strobed into the 82490XP by
asserting MBRDY# on MCLK edges. MEOC# will
latch the last piece of data as it switches buffers. In
strobed mode, data is read by MISTB. Data that is
read into MDATA must meet proper setup and hold
times.

Data at the MDATA inputs need not follow setup and 
hold times to MCLK edges that sample MBRDY# 
inactive. 


7.32.3 RELATION TO OTHER SIGNALS 

CDTS# indicates that write data is in the 82490XP 
buffers. If MDOE# is active, write data is available at 
MDATA some time after CDTS# or MEOC# is sam- 
pled active. Subsequent write data is available at 
MDATA after MBRDY# assertion or MOSTB chang- 
ing. 


Since MDOE# is a direct connection to the 
82490XP memory output drivers, MDOE# must al- 
ways be driven to a valid level. With MDOE# inac- 
tive, data in the 82490XP’s may be driven to MDATA 
outputs with some propagation delay from MDOE# 
going active. Similarly, there is some float delay from 
MDOE# going inactive. 

MDOE# must be inactive for the 82490XP to read 
memory data. 

7.33.3 RELATION TO OTHER SIGNALS 

MDOE# has no relation to MCLK, MOCLK, or 
MOSTB. Since MDOE# controls the final stage of 
the MDATA output buffers, it has no effect on any 
other signal of the 82490XP. 


7.34 MEMLDRV 

Memory Low Capacitance Drivers 

Selects the Low Capacitance Drivers for the 
82495XP and the 82490XP 
Inputs to 82495XP and 82490XP (pins Q4, 24) Configuration
Signal

Synchronous to CLK 

7.34.1 SIGNAL DESCRIPTION 

MEMLDRV is a pin on both the 82495XP and
82490XP that, when high during reset, selects the normal
memory output drivers. If this pin is driven
low at reset, the high capacitance drivers are selected.
Specifically, these are the 82495XP address outputs
to the memory bus, and the 82490XP MDATA
outputs. The normal output drivers are designed to
drive up to 50 pF loads. The high capacitance drivers
can drive up to 100 pF without derating.
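
A board-level check of the strapping choice might look like the following Python sketch. It only restates the two cases given above; the function names and the idea of comparing against an estimated bus load are assumptions for illustration.

```python
def select_memory_drivers(memldrv_high_at_reset):
    """Return (driver type, maximum load in pF without derating)."""
    if memldrv_high_at_reset:
        return ("normal", 50)
    return ("high capacitance", 100)


def load_within_rating(memldrv_high_at_reset, bus_load_pf):
    """Check a hypothetical estimated bus load against the selected driver."""
    _driver, max_pf = select_memory_drivers(memldrv_high_at_reset)
    return bus_load_pf <= max_pf


print(select_memory_drivers(True), load_within_rating(True, 80))
```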


7.34.2 WHEN SAMPLED 

MEMLDRV is sampled as shown in Figure 7-1 with a setup
time of 4 CPU clocks for the 82495XP and 1 CPU
clock for the 82490XP. On the 82495XP, MEMLDRV
becomes the SYNC# input once FSIOUT# goes
inactive. On the 82490XP, MEMLDRV becomes the
MFRZ# signal which is sampled after the first memory
cycle begins.


7.34.3 RELATION TO OTHER SIGNALS 

MEMLDRV shares a pin with SYNC# on the 
82495XP and MFRZ# on the 82490XP. 


7.35 MEOC# 

Memory End of Cycle 

Ends a cycle in 82490XP by switching buffers 
Input to 82490XP (pin 23) Cycle Control Signal 

Synchronous to MCLK or Asynchronous (strobed 
mode) 

7.35.1 SIGNAL DESCRIPTION

MEOC# is an input to the 82490XP that ends the
current cycle and switches memory buffers for a new
cycle. Switching to the next cycle does not cause
information to be lost in the memory or CPU buffers
in the 82490XP, but rather switches new buffers to
the memory I/O bus of the 82490XP.

MEOC# is provided so that the memory system,
which is synchronous to MCLK, can switch to a new
cycle without synchronization. In clocked memory
bus mode MEOC# is sampled with the rising edge
of MCLK. In strobed memory bus mode the MEOC#
function is performed with rising or falling edges of
MEOC#.


For read or write cycles, MEOC# may be activated 
on or after the clock edge of the last MBRDY# of 
the current cycle. If a cycle is pending (pipelining is 
used), the next cycle will flow-through with a propa- 
gation delay from MEOC# assertion. MEOC# is re- 
quired for all memory bus cycles. 

In addition to switching memory buffers, MEOC#
does three other things. One, MEOC# activation
causes the memory burst counter to be reset to its
start value and, if MSEL# is active, MZBT# is sampled.
This allows MSEL# to stay active between cycles.
Two, MEOC# activation during a write cycle
causes MFRZ# to be sampled for a subsequent
allocation (line-fill). Three, MEOC# latches in the
last slice of data (like MBRDY#) before switching
buffers.


7.35.2 WHEN SAMPLED 

In clocked memory bus mode, MEOC# is sampled 
on every MCLK edge. It must always observe setup 
and hold times to MCLK. In strobed memory bus 
mode, MEOC# is always sampled and must meet 
proper active/inactive times. 

7.35.3 RELATION TO OTHER SIGNALS 

MEOC# is provided so that a cycle may end on the 
memory bus before CRDY# can be asserted. The 
implication rules surrounding MEOC# are: 

1. MEOC# ≤ CRDY#

2. MEOC# for cycle N + 1 ≥ 2 clocks after CRDY#
of cycle N

3. MEOC# for cycle N + 1 ≥ 2 clocks after last
BRDY# of cycle N

4. MEOC# ≥ BGT#

MEOC# active with MSEL# active causes the sam- 
pling of MZBT# and MFRZ#. 


7.36 MFRZ# 

Memory Data Freeze 

Freezes Memory Write Data in 82490XP Buffer 
Input to 82490XP (pin 24) Cycle Control Signal 
Synchronous to MCLK or Strobed 

7.36.1 SIGNAL DESCRIPTION 

MFRZ# is an input to the 82490XP that when active
causes the 82490XP to “freeze” write data in the
82490XP memory buffer and allow a subsequent allocation
to fill a cache line around it. MFRZ# is provided
so that an actual write to memory need not be
done to perform an allocation. Using MFRZ# to perform
this dummy write cycle requires that the memory
bus controller put the allocated line into the “M”
state.

PALLC# must be active and MKEN# must be re- 
turned active for the write cycle to be turned into an 
allocation. MFRZ# is sampled when MEOC# goes 
active at the end of the write cycle. The subsequent 
line fill is then filled around the write data to com- 
plete the allocation. 


7.36.2 WHEN SAMPLED 

In clocked memory bus mode, MFRZ# is sampled 
with the MCLK rising edge that MEOC# is sampled 
active for all CPU write cycles. MFRZ# need only 
follow a proper setup and hold time in this situation. 

In strobed mode, MFRZ# is sampled with the falling 
edge of MEOC# for write cycles. MFRZ# need only 
follow a proper setup and hold time in this situation. 


When the device which controls the memory bus 
(the master) performs a memory access, a snoop is 
requested of all other caching devices on the bus 
(snoopers). An asserted MHITM# pin from any of 
the snooper 82495XPs alerts the master that main 
memory’s data is stale, and that the bus must be 
temporarily given to the snooper which has its 
MHITM# asserted so that the modified line can be 
written out to the memory bus. 


7.37.2 WHEN DRIVEN 


The snoop lookup is performed in the clock in which
SNPCYC# is asserted. The MHITM# result for the
snoop is driven on the CLK following SNPCYC#,
and remains valid until the next assertion of
SNPSTB#. The MHITM# signal is not valid from
SNPSTB# until the CLK after SNPCYC#.




7.37.3 RELATION TO OTHER SIGNALS 


MHITM# and MTHIT# outputs together indicate the
results of a snoop lookup in the 82495XP.


7.36.3 RELATION TO OTHER SIGNALS 

MFRZ# is sampled with MEOC# going active or
being active for write cycles. MFRZ# is used so that
a dummy write cycle can be performed. If an allocation
is done, DRCTM# must be asserted during the
SWEND# window of the line fill to put the allocated
line in the “M” state.

MFRZ# shares a pin with the MEMLDRV configura- 
tion input. 


7.37 MHITM# 

Memory Bus Hit [M] 

Indicates snoop hit to modified line 

Output from 82495XP (pin H4) Snooping Signal 

Sync to CLK 

7.37.1 SIGNAL DESCRIPTION 

The MHITM# output is driven by the 82495XP during
a snoop cycle to indicate that the snooping address
has hit a Modified line. If the signal is logic
high, the snoop has not hit a modified line; if the
signal is logic low, the snoop has hit a modified line.
When a snoop hits a modified line, the 82495XP automatically
schedules a write-back of the hit modified
line to the memory bus.


An 82495XP can accept a snoop request while performing
memory bus transfers of its own. If a snoop
is requested of an 82495XP while it is performing a
data transfer of its own, the results of the snoop may
be delayed. If SNPSTB# is sampled at an 82495XP
after it has received BGT# for its own cycle, the
snoop lookup is performed (SNPCYC# active) after
the SWEND# of its own cycle, and MHITM# is driven
with valid results one CLK after SNPCYC# (see
Sections 6.2.4 and 6.2.5).


7.38 MISTB 

Memory Bus Input Strobe 

Strobes data into the 82490XP

Input to 82490XP (pin 22) Cycle Control Signal 

Asynchronous 


7.38.1 SIGNAL DESCRIPTION 

MISTB is an input to the 82490XP that, on rising or 
falling edges, causes the 82490XP to latch its MDA- 
TA inputs. MISTB is used in strobed memory bus 
mode. In clocked memory bus mode, MISTB is the 
MBRDY# input. 
7.38.2 WHEN SAMPLED

MISTB is always sampled by the 82490XP. MISTB 
must meet proper strobed mode active and inactive 
times. 

7.38.3 RELATION TO OTHER SIGNALS 

MISTB causes the latching of the 82490XP MDATA 
inputs in strobed mode. MISTB shares a pin with 
MBRDY#. 


7.39 MKEN# 

Memory Cache Enable 
Determines 82495XP and CPU cacheability 
Input to 82495XP (pin R1) Cycle Attribute Signal 
Synchronous to CLK 

7.39.1 SIGNAL DESCRIPTION 

MKEN# is an input to the 82495XP that is sampled
at the closing of the cacheability window (KWEND#
is sampled active). The 82495XP drives KEN# back
to the CPU one clock after sampling the value of
MKEN#. MKEN# thus determines whether the current
cycle is cacheable in the 82495XP and in the
CPU.

For read cycles, if MCACHE# is active (cacheable),
KEN# is driven out of the 82495XP to the CPU to
indicate cacheability. If MKEN# is sampled inactive
during KWEND# activation, KEN# is brought inactive
by the 82495XP, and the line will not be cacheable
by the CPU or 82495XP. If MCACHE# is inactive,
the line will be non-cacheable regardless of
MKEN#. PCD active will cause MCACHE# to be
inactive.
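
The read-cycle decision above reduces to a two-input condition. The Python sketch below is illustrative only; the function names are hypothetical, the booleans mean "signal is active", and the effect of MRO# is deliberately left out.

```python
def mcache_active_for_read(pcd_active):
    """PCD active causes MCACHE# to be inactive for the read cycle."""
    return not pcd_active


def read_line_cacheable(mcache_active, mken_active_at_kwend):
    """A read line is cacheable only if MCACHE# is active and MKEN# is
    sampled active when KWEND# closes the cacheability window."""
    return mcache_active and mken_active_at_kwend


mcache = mcache_active_for_read(pcd_active=False)
print(read_line_cacheable(mcache, mken_active_at_kwend=True))
```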

MKEN# is sampled during write-through cycles that
are potentially allocatable (PALLC# is active during
the write cycle). If MKEN# is sampled active during
KWEND# activation of the write cycle, an allocation
will occur, and a line-fill will follow the write cycle.
MKEN# during the line-fill is ignored. The MBC indicates
to the 82495XP that it intends to perform an
allocation by asserting MKEN#.

MKEN# must be sampled 1 clock before the first
BRDY# assertion to make a line-fill non-cacheable
to the CPU.


7.39.2 WHEN SAMPLED 

MKEN# is sampled on the clock edge that 
KWEND# is first sampled active. In all other places 
MKEN# may violate setup and hold times. 


7.39.3 RELATION TO OTHER SIGNALS 

MKEN# and MRO# are sampled with KWEND#
active. MKEN# must be sampled at least 2 clocks
before BRDY# assertion to make a line-fill
non-cacheable.


7.40 MOCLK

Memory Data Output Clock 

Separate Clock Reference for Memory Data Output 

Input to 82490XP (pin 27) 

Asynchronous 


7.40.1 SIGNAL DESCRIPTION 

MOCLK is the latch enable for the 82490XP memory 
data outputs (MDATA). MOCLK controls the latching 
of a transparent latch which, when high, causes 
MDATA to be driven from MCLK. When low, MDATA 
is latched. MOCLK may only be used in clocked 
memory bus mode and only affects output data. It is 
provided so that a greater MDATA output hold time 
can be generated. 

To be used effectively, MOCLK must be a clock input
that is skewed from MCLK. The following picture
shows how MOCLK has increased the hold time of
the output burst data:



7.40.2 WHEN SAMPLED 

MOCLK is sampled during and after RESET to determine
whether output data should be driven from
MCLK or MOCLK. If toggling, MOCLK controls the
MDATA outputs with MCLK. If high, data is driven
from MCLK alone. Regardless, input data is never
referenced to MOCLK.

In strobed memory bus mode the MOCLK signal be- 
comes MOSTB. MOCLK is only used in clocked 
memory bus mode. 
7.40.3 RELATION TO OTHER SIGNALS

To be used effectively, MOCLK must be the same 
frequency as MCLK but be skewed. This effectively 
increases MDATA hold time to main memory. Main 
memory must sample the data on MCLK edges. 

MOCLK shares a pin with the MOSTB signal. 


7.41 MOSTB 

Memory Bus Output Strobe 

Strobes data out of 82490XP 

Input to 82490XP (pin 27) Cycle Control Signal 

Asynchronous 


7.41.1 SIGNAL DESCRIPTION 

MOSTB is an input to the 82490XP that, on rising
and falling edges, causes the 82490XP to output
data through its MDATA outputs. MOSTB is only
used in strobed memory bus mode. In clocked memory
bus mode, MOSTB is the MOCLK input.


state, and causes the line to be non-cacheable to 
the CPU. Writes to read-only lines in the 82495XP 
are treated as write-misses that are non-allocatable 
(PALLC# is inactive). MRO# is a bit in each 
82495XP tag entry. 


Once MRO# is sampled active during KWEND# ac- 
tivation, KEN# to the CPU is driven inactive regard- 
less of the state of MKEN#. MKEN# does, howev- 
er, determine whether the 82495XP will cache the 
read-only line. Once MRO# is returned active, the 
CPU will only require the number of transfers as indi- 
cated by LEN and CACHE#. If MKEN# Is returned 
active, the 82495XP will require an entire cache line. 
82495XP read-only cache lines are filled to the [S] 
state. 

The line-fill portion of an allocation may be filled to 
the read-only state by returning MRO# active during 
KWEND# of the line-fill. MRO# is ignored during 
the write portion. 



If MRO# is returned active during KWEND#, 
DRCTM# and MWB/WT# are ignored during 
SWEND#. 


7.41.2 WHEN SAMPLED 

MOSTB is always sampled by the 82490XP. MOSTB 
must meet strobed mode active and inactive times. 


MRO# must be returned to the 82495XP at least 2 
clocks before BRDY# Is returned to the CPU so 
KEN# can be sampled properly. 

There is one Read-Only bit per tag In the 82495XP. 


7.41.3 REALTION TO OTHER SIGNALS 

MOSTB strobes data out of the 82490XP through 
MDATA. MOSTB shares a pin with MOCLK. 

7.42 MRO# 

Memory Read-Only 

Designates current line as read-only 

Input to 82495XP (pin J1) Cycle Attribute Signal 

Synchronous to CLK 

7.42.1 SIGNAL DESCRIPTION 

MRO# is an Input to the 82495XP that Is sampled at 
the closing of the cacheability window (KWEND# 
activation). If sampled active, it causes the current 
line fill to the 82495XP to be put in the read-only 


7.42.2 WHEN SAMPLED 

MRO# is sampled on the first clock that KWEND#
is sampled active. In all other clocks, MRO# need
not follow setup and hold times.

7.42.3 RELATION TO OTHER SIGNALS 

MRO# and MKEN# are sampled with KWEND# 
activation. MRO# must be returned at least 2 clocks 
prior to the first BRDY#. 

7.43 MSEL# 

Memory Buffer Chip Select 
Selects 82490XP, Causes Sampling of MZBT#
Input to 82490XP (pin 25) Cycle Control Signal
Synchronous to MCLK or Strobed

7.43.1 SIGNAL DESCRIPTION

MSEL# is an input to the 82490XP that has 3 main 
functions. One, MSEL# active qualifies the 
MBRDY# input to the 82490XP. If MSEL# is inac- 
tive for a particular 82490XP, MBRDY# will not be 
recognized by that 82490XP. 

Two, MSEL# going active causes the sampling of
MZBT# for the next transfer.

Three, MSEL# going inactive resets the 82490XP
internal memory burst counter. The 82490XP con-
tains a memory burst counter that counts through
the CPU burst order with each MBRDY# assertion
and increments a pointer to the 82490XP memory
buffer being accessed.

MSEL# going inactive will reset this burst counter to
its original burst value. By resetting this counter be-
fore MEOC# assertion, all information currently be-
ing read into the 82490XP is lost, but information
that is being written out is maintained and may be
rewritten.

In general, MSEL# may stay inactive for single
transfer cycles such as posted 64-bit write cycles.
Once active, MSEL# need not go inactive as the
burst counter is reset with MEOC# activation. Since
MZBT# may also be sampled with MEOC#, it is
possible to leave MSEL# asserted throughout most
basic transfers.

MSEL# or MEOC# must be used to reset the burst
counter before any transfer begins. If transfers are
interrupted (by a snoop hit before BGT# assertion
for example), MSEL# must be brought inactive so
the burst counter may be reset for the snoop write
back.

MSEL# must be sampled inactive for at least 1 
MCLK after reset. This resets the memory burst 
counter for the first transfer. 


7.43.2 WHEN SAMPLED 

In clocked memory bus mode, MSEL# is sampled
with all rising edges of MCLK. In this mode, if
MSEL# is sampled inactive, the memory burst
counter is reset and MZBT# is sampled. If MSEL#
is sampled active and MBRDY# is sampled active,
the memory burst counter is incremented. Since it is
constantly sampled with MCLK, MSEL# must al-
ways be driven to a known state and must always
meet setup and hold times to every MCLK edge.
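
The clocked-mode behavior described above can be illustrated with a small behavioral sketch. This is not Intel's implementation: the class name, the method name, and the use of booleans for "active" are assumptions made for illustration, and the counter is modeled as a plain transfer index rather than the actual CPU burst order.

```python
class BurstCounterModel:
    """Hypothetical behavioral model of the 82490XP memory burst
    counter in clocked memory bus mode (illustration only)."""

    def __init__(self):
        self.count = 0           # number of transfers since the last reset
        self.zero_based = False  # last sampled MZBT# (True = active)

    def mclk_rising(self, msel_active, mbrdy_active=False, mzbt_active=False):
        """Apply one rising edge of MCLK and return the counter value."""
        if not msel_active:
            # MSEL# sampled inactive: reset the counter and sample MZBT#.
            self.count = 0
            self.zero_based = mzbt_active
        elif mbrdy_active:
            # MSEL# and MBRDY# both sampled active: advance the counter.
            self.count += 1
        # MSEL# active with MBRDY# inactive leaves the counter unchanged.
        return self.count


model = BurstCounterModel()
model.mclk_rising(msel_active=False, mzbt_active=True)  # reset, latch MZBT#
model.mclk_rising(msel_active=True, mbrdy_active=True)  # first transfer
model.mclk_rising(msel_active=True, mbrdy_active=True)  # second transfer
```

MEOC# is deliberately left out of the sketch; the text notes that it also resets the counter and samples MZBT#.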


In strobed mode, MSEL# falling edge causes the
sampling of MZBT#. While MSEL# is active, MISTB
and MOSTB cause the memory burst counter to be
incremented. The rising edge of MSEL# causes the
memory burst counter to be reset.

MSEL# must be inactive sometime after RESET be- 
fore the first transfer to initialize the burst counter. 


7.43.3 RELATION TO OTHER SIGNALS 

MSEL# causes the sampling of MZBT#, and quali- 
fies the use of MBRDY#, MOSTB, and MISTB. 
Since MSEL# acts as a qualifier for these signals, 
MSEL# may be asserted at the same time as 
MBRDY#, MOSTB, or MISTB. 


7.44 MTHIT# 

Memory Bus Tag Hit 
Indicates snoop hit 

Output from 82495XP (pin G3) Snooping Signal 
Sync to CLK 

7.44.1 SIGNAL DESCRIPTION 

The MTHIT# output is asserted by the 82495XP
during snoop cycles to indicate that the snoop ad-
dress has hit a line in the 82495XP cache. An as-
serted MTHIT# signal from any of the snooping
82495XPs alerts a bus master that the data being
accessed resides in another cache. If SNPINV was
not asserted on the snoop request, the copy of the
data in an 82495XP asserting MTHIT# will remain
valid and in the Shared state, so a caching master
must also place its copy of the data in the Shared
state.


7.44.2 WHEN DRIVEN 

The snoop lookup is performed in the CLK in which
SNPCYC# is asserted. The MTHIT# result for the 
snoop is driven on the next CLK and remains valid 
until the next assertion of SNPSTB#. The MTHIT# 
signal is not valid from SNPSTB# until the CLK after 
SNPCYC#. 


7.44.3 RELATION TO OTHER SIGNALS 

MTHIT# and MHITM# together indicate the results 
of a snoop lookup in the 82495XP. 




82495XP Cache Controller/82490XP Cache RAM 




An 82495XP can accept a snoop request while per-
forming memory bus transfers of its own. If a snoop
is requested while it is performing a transfer of its
own, the results of the snoop may be delayed. If
SNPSTB# is sampled at an 82495XP after it has re-
ceived BGT# for its own cycle, the snoop lookup is
performed (SNPCYC# active) after the SWEND# of
its own cycle, and MTHIT# is driven with the valid
result one CLK after SNPCYC# (see Sections 6.2.4
and 6.2.5).

Because an asserted MTHIT# from any snooping
82495XP requires the master to place the fetched
line in the Shared state (unless it is an invalidating
snoop), the memory bus controller should include
the MTHIT# signals of other processors when gen-
erating the MWB/WT# signal to its own 82495XP.


7.45 MWB/WT# 

Memory Write-back/Write-through 
Forces lines to be filled to the [S] state 
Input to 82495XP (pin K3) Cycle Attribute Signal 
Synchronous to CLK 


7.45.1 SIGNAL DESCRIPTION 

MWB/WT# is an input to the 82495XP that is sam-
pled at the closing of the snoop window (SWEND#
activation). If sampled active, the current line-fill is
filled to the [S] state in the 82495XP. The [S] state
is a write-through state in the 82495XP.

MWB/WT# is used in many cases. If a cache to
cache transfer updates memory and leaves the data
valid in the other cache, the line must be filled to the
[S] state instead of the [E] state default. A portion of
memory may be designated as write-through by as-
serting MWB/WT# for appropriate addresses.

MWB/WT# has no effect on the 82495XP if
DRCTM# is sampled active or MRO# has been
sampled active during KWEND#. If PWT is active,
MWB/WT# has no effect and the line is filled to the
[S] state.
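
The interaction of MRO#, PWT and MWB/WT# on the final line-fill state can be summarized in a short sketch. The function name and boolean arguments (True = sampled active) are illustrative assumptions, and the sketch covers only the case where DRCTM# was sampled inactive, since the effect of DRCTM# is not described in this section.

```python
def line_fill_state(mro_active, pwt_active, mwbwt_active):
    """Return (state, read_only) for a line-fill, assuming DRCTM# inactive.

    Hypothetical helper for illustration; True means "sampled active".
    """
    if mro_active:
        # Read-only lines are filled to [S]; MWB/WT# is ignored.
        return ("S", True)
    if pwt_active or mwbwt_active:
        # PWT forces [S] on its own; otherwise MWB/WT# active selects [S].
        return ("S", False)
    # Default fill state when nothing forces write-through.
    return ("E", False)


print(line_fill_state(mro_active=False, pwt_active=False, mwbwt_active=True))
```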


7.45.2 WHEN SAMPLED 

MWB/WT# is sampled on the first clock edge that
SWEND# is sampled active. If MWB/WT# is not
being sampled, it need not follow setup and hold
times.


7.46 MX4/MX8# 

MTR4/MTR8# 

Memory 4/8 I/O bits 
Memory 4/8 Transfers 

Selects MDATA Input/Output width and number of 
memory bus transfers 

Inputs to 82490XP (pins 21, 25) Configuration Sig- 
nals 

Synchronous to CLK 


7.46.1 SIGNAL DESCRIPTION 

MX4/MX8# configures the 82490XP to use 
MDATA[0:3] or MDATA[0:7] memory bus I/O pins. 
MTR4/MTR8# selects whether a cache line will
take 4 or 8 transfers. These selections depend on 
the line ratio (82495XP line size / CPU line size) and 
must be configured according to the following table: 


Line      MX4/      MTR4/      Membus      CPUbus
Ratio     MX8#      MTR8#      I/O Pins    I/O Pins

1         1         1          4           4
2         1         0          4           4
2         0         1          8           4
4         0         0          8           4
1         0         1          8           8
2         0         0          8           8
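
As an illustration, the configuration table can be expressed as a lookup keyed on the system parameters. The dictionary name and function name are hypothetical; only the table values come from this section.

```python
# (line ratio, CPU-bus I/O pins, memory-bus I/O pins) -> (MX4/MX8#, MTR4/MTR8#)
CONFIG_TABLE = {
    (1, 4, 4): (1, 1),
    (2, 4, 4): (1, 0),
    (2, 4, 8): (0, 1),
    (4, 4, 8): (0, 0),
    (1, 8, 8): (0, 1),
    (2, 8, 8): (0, 0),
}


def strap_levels(line_ratio, cpu_pins, mem_pins):
    """Return the (MX4/MX8#, MTR4/MTR8#) pin levels for a configuration.

    Raises ValueError for combinations that do not appear in the table.
    """
    key = (line_ratio, cpu_pins, mem_pins)
    if key not in CONFIG_TABLE:
        raise ValueError(f"configuration not listed in the table: {key}")
    return CONFIG_TABLE[key]


mx, mtr = strap_levels(line_ratio=2, cpu_pins=4, mem_pins=8)
print(mx, mtr)
```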


7.46.2 WHEN SAMPLED 

These signals are sampled like Figure 7-1 with a set-
up time of 1 clock. Once the first CADS# is issued
by the 82495XP these signals are sampled for the
MZBT# and MSEL# functions.


7.46.3 RELATION TO OTHER SIGNALS 

MX4/MX8# shares a pin with MZBT# and MTR4/ 
MTR8# shares a pin with MSEL#. 


7.47 MZBT# 

Memory Zero Base Transfer 
Forces cycles to begin at subline address 0 
Input to 82490XP (pin 21) Cycle Control Signal 
Synchronous to MCLK or Strobed 


7.45.3 RELATION TO OTHER SIGNALS 

Both MWB/WT# and DRCTM# are sampled with
SWEND#.

7.47.1 SIGNAL DESCRIPTION

MZBT# is an input to the 82490XP that forces a 
read or write cycle to begin with burst address 0 
regardless of the CPU generated address. 

MZBT# is sampled before the transfer begins.
MZBT# is sampled with MSEL# and MEOC#.
MZBT# is sampled with MSEL# going active for the
current cycle. If MSEL# stays active between cy-
cles, MZBT# is sampled with MEOC# going active
for the previous cycle.

Once sampled, data input to the 82490XPs will start
at burst address 0 and continue through 4, 8, C, etc. 
If the CPU is requesting a burst location other than 
0, the memory bus controller must hold off any 
BRDY# until that bursted item is read from the 
memory bus. 

7.47.2 WHEN SAMPLED 

In clocked mode, MZBT# is sampled in two loca-
tions. First, MZBT# is sampled on all MCLK rising
edges where MSEL# is sampled inactive. Once
MSEL# is sampled active, the value of MZBT# that
was sampled one MCLK before is used for the next
transfer.

Second, MZBT# is sampled on MCLK rising edges
where MEOC# is sampled active with MSEL# ac-
tive. The MZBT# value sampled will be used for the
next transfer. This allows MSEL# to stay asserted
between transfers if so desired.

In strobed mode, MZBT# is sampled with the same
two signals. First, it is sampled with the falling edge
of MSEL#. Second, it is sampled with the falling
edge of MEOC# if MSEL# is active.

In clocked memory bus mode MZBT# must follow
setup and hold times to all MCLK edges where
MSEL# is sampled inactive or MEOC# is sampled
active with MSEL# active.

In strobed memory bus mode MZBT# must meet
setup and hold times to MSEL# falling edge and
MEOC# falling edge if MSEL# is active.


7.47.3 RELATION TO OTHER SIGNALS 

MZBT# is sampled with MSEL# and MEOC# and
has no effect otherwise. In systems that will never
force a zero-based transfer, MZBT# may be driven
high after RESET.

MZBT# shares a pin with the MX4/MX8# configu-
ration input.


7.48 NCPFLD# 

Non-Cacheable PFLD 

Enables Non-Cacheable Floating Point Loads 
Input to 82495XP (N4) Configuration Signal 
Asynchronous

7.48.1 SIGNAL DESCRIPTION 

During RESET, this pin functions as the NCPFLD#
configuration signal. The 82495XP can be config- 
ured to decode i860 XP CPU PFLD (Pipelined Float- 
ing Point Load) cycles. The 82495XP supports 3 op- 
erational modes for PFLD cycle decoding as defined 
by FPFLDEN and NCPFLD#: 

Mode #1. PFLD cycles that are cached in the 
82495XP. 

Mode #2. PFLD cycles not cached in the 82495XP, 
without an external PFLD extension 
FIFO. 

Mode #3. PFLD cycles not cached in the 82495XP, 
with an external PFLD extension FIFO. 


Mode #          FPFLDEN     NCPFLD#

1               0           1
2               0           0
3               1           1
Illegal Mode    1           0


See Section 5.2.5 for details. 
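
A minimal sketch of decoding the PFLD mode from the two configuration inputs, using the pin levels in the table above. The names `PFLD_MODES` and `pfld_mode` are hypothetical and exist only for this illustration.

```python
# (FPFLDEN level, NCPFLD# level) sampled at RESET -> PFLD mode number
PFLD_MODES = {
    (0, 1): 1,  # PFLD cycles cached in the 82495XP
    (0, 0): 2,  # not cached, no external PFLD extension FIFO
    (1, 1): 3,  # not cached, with an external PFLD extension FIFO
}


def pfld_mode(fpflden, ncpfld_n):
    """Return the PFLD mode number for the sampled pin levels (0 or 1).

    Raises ValueError for the combination the table marks as illegal.
    """
    try:
        return PFLD_MODES[(fpflden, ncpfld_n)]
    except KeyError:
        raise ValueError(
            f"illegal PFLD mode: FPFLDEN={fpflden}, NCPFLD#={ncpfld_n}"
        ) from None


print(pfld_mode(fpflden=0, ncpfld_n=1))
```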




7.48.2 CASES IT IS ASSERTED AND 
DEASSERTED 

NCPFLD# is sampled on the falling edge of RESET 
and is a don’t care at any other time. NCPFLD# 
must be valid for at least 10 CLK’s before RESET’s 
falling edge. 

7.48.3 RELATION TO OTHER SIGNALS 

NCPFLD# shares a pin with FLUSH#. Both 
NCPFLD# and FPFLDEN describe the PFLD mode 
used. 


7.49 NENE# 

Next Near 

Indicates current cycle address is near previous one. 
Output from 82495XP (pin D5) Cycle Control Signal 
Synchronous to CLK 

7.49.1 SIGNAL DESCRIPTION 

NENE# indicates to the MBC that the address of 
the requested memory cycle is “near” the address
of the previously generated one (in the same 2K 
DRAM page). This information may be used by the 
MBC to optimize access to paged or static column 
DRAMS. 


7.49.2 WHEN DRIVEN 

NENE# is valid together with CADS# and will stay 
valid until CNA# or CRDY#. 


7.49.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.

NENE# may change state after CNA# or CRDY# 
are asserted to the 82495XP. 


7.50 PALLC# 

Potential Allocate 

Indicates 82495XP intent to allocate current cycle 
Output from 82495XP (pin D2) Cycle Control Signal 
Synchronous to CLK 


7.50.1 SIGNAL DESCRIPTION 

PALLC# indicates to the MBC that the current write
cycle may allocate (perform a line-fill on) a cache
line. The MBC chooses to perform an allocation by
asserting MKEN# during KWEND# of the write cy-
cle. Potential allocate cycles are cycles which are
82495XP misses with PCD and PWT inactive.

The exact condition for assertion of PALLC# is: 

Miss * !PCD * !PWT * LOCK# * W/R# * D/C# * M/IO#

PALLC# is inactive (HIGH) for any write-hit to a 
Read-Only line. 
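
The assertion condition can be written as a boolean expression. This sketch assumes each term is a pin level (True = high), so that LOCK# high means an unlocked cycle, W/R# high a write, D/C# high a data cycle, and M/IO# high a memory cycle; that reading and the function name are assumptions for illustration.

```python
def pallc_asserted(miss, pcd, pwt, lock_n, w_r, d_c, m_io):
    """Return True when PALLC# would be driven active.

    miss is True for an 82495XP miss; the other arguments are pin
    levels, with True meaning the pin is high (assumed reading).
    """
    return (miss and not pcd and not pwt
            and lock_n and w_r and d_c and m_io)


# Unlocked, cacheable, write-back memory data write that misses:
print(pallc_asserted(miss=True, pcd=False, pwt=False,
                     lock_n=True, w_r=True, d_c=True, m_io=True))
```

A write-hit to a read-only line never satisfies the `miss` term, which matches the statement that PALLC# stays inactive for such cycles.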

7.50.2 WHEN DRIVEN 

PALLC# is valid in the same CLK as CADS# and is
valid until CRDY# or CNA#. 


7.50.3 RELATION TO OTHER SIGNALS 

PALLC# is valid with CADS#. 


7.51 PAR# 

Parity Selection 

Selects 82490XP as a Parity Device 

Input to 82490XP (pin 32) Configuration Signal 

Synchronous to CLK 

7.51.1 SIGNAL DESCRIPTION 

PAR# is a strapping option on the 82490XP that,
when strapped low, configures that 82490XP device 
to be a dedicated parity device. A 82490XP parity 
device must be configured the same as all the other 
devices, however, the data lines are defined differ- 
ently. CDATA[0:3] are 4 parity bit I/O lines and 
CDATA[4:7] are 4 bit select lines so each parity line 
may be written individually. Parity devices must be 
used as follows: 


Cache     Memory       Number       82490XP
Size      Bus          of Parity    I/O Bits
          Width        Devices      (CPU:Mem)

256K      64           2            4:4
512K      128          2            4:8


7.51.2 WHEN SAMPLED

PAR# is a strapping option and must be tied either 
high or low. 

7.51.3 RELATION TO OTHER SIGNALS 

PAR# affects the definition of the CDATA and MDA- 
TA lines of the 82490XP. 


7.52 RDYSRC 

Ready Source 

Cycle control signal to the MBC 

Output from 82495XP (pin C1) Cycle Control Signal

Synchronous to CLK 


7.52.1 SIGNAL DESCRIPTION 

RDYSRC serves as a cycle control signal to the
MBC. It indicates the source of the BRDY# genera-
tion (either 82495XP or MBC) for the CPU. When
high, it indicates that the MBC should generate the
BRDY#s to the CPU; when low, it indicates that the
82495XP will provide the BRDY#s.

RDYSRC is asserted for line-fill and not asserted for 
the write portion of allocation cycles. 


7.52.2 WHEN DRIVEN 

RDYSRC is valid in the same CLK as CADS# and is 
valid until CRDY# or CNA#. 


7.52.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.53 RESET 

Reset 

Forces the 82495XP to begin execution in a known
state

Input to 82495XP (Q5) 

Asynchronous 


7.53.1 SIGNAL DESCRIPTION 

The falling edge of this signal tells the 82495XP to 
sample all configuration inputs and initializes the 
82495XP to a known state. See the specific configu- 
ration signals for setup and hold times relative to 
RESET's falling edge. RESET can be asserted at
any time. 

During initialization, the 82495XP LRU bits are set
to 1 indicating that the 82495XP LRU way is way 1.
The 82490XP MRU bits are initialized to 0 as are all
tag array bits.

RESET takes about 4100 clocks in the 82495XP. 
RESET with self-test takes about 80,000 clocks. 


7.53.2 WHEN SAMPLED 

RESET is an asynchronous input. RESET must have
a pulse width of at least 8 CLKs in order to guaran-
tee 82495XP recognition.


7.53.3 RELATION TO OTHER SIGNALS 


The following signals are sampled at RESET: 


CNA# [CFG0]:         CFG0 line of 82495XP
                     configuration inputs.

SWEND# [CFG1]:       CFG1 line of 82495XP
                     configuration inputs.

KWEND# [CFG2]:       CFG2 line of 82495XP
                     configuration inputs.

FLUSH# [NCPFLD#]:    If low, enables decoding of
                     i860XL non-cacheable PFLD
                     mode.

FPFLD# [FPFLDEN]:    If high, enables the external
                     FIFO for i860XL PFLD mode.

BGT# [C490LDRV]:     Indicates the driving strength of
                     the 82495XP/82490XP
                     interface.

SYNC# [MEMLDRV]:     Indicates the memory bus
                     driving strength.

SNPCLK# [SNPMD]:     Indicates the snooping mode;
                     synchronous or strobed.

CFG2-CFG0:           Configure cache parameters
                     such as lines/sector, line ratio,
                     and number of tags.


7.54 SLFTST#


Self Test

Executes 82495XP self-test

Input to 82495XP (pin M2) Test Signal

Synchronous to CLK


7.54.1 SIGNAL DESCRIPTION

If SLFTST# is sampled low and HIGHZ# is sam-
pled high, the 82495XP will perform a self-test after
reset. The results of the self-tests are given by
CAHOLD when FSIOUT# goes inactive.


7.54.2 WHEN SAMPLED

SLFTST# is sampled with reset like Figure 7-1 with a
setup time of 10 CPU clocks. SLFTST# is then a
“don’t care” until after the first CADS# activation
when it becomes the CRDY# pin.


7.54.3 RELATION TO OTHER SIGNALS

SLFTST# shares a pin with CRDY#. The 82495XP
enters self-test if both SLFTST# is sampled active
and HIGHZ# is sampled inactive.


7.55 SMLN#

Same Line

Current cycle is same 82495XP line as previous one.
Output from 82495XP (pin C6) Cycle Control Signal
Synchronous to CLK

7.55.1 SIGNAL DESCRIPTION

SMLN# is used to indicate to the MBC that the cur-
rent cycle is accessing the same 82495XP cache
line as the previous cycle. This indication can be
used by the MBC to selectively activate its
SNPSTB# signal to other caches in the system. For
example, back-to-back snoop hits to the same line
may be snooped only once.

7.55.2 WHEN DRIVEN

SMLN# is asserted with CADS# and will stay valid
until CNA# or CRDY#.


7.55.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.


7.56 SNPADS#

Cache Snoop Address Strobe

Initiates a snoop write back cycle

Output from 82495XP (pin F3) Snooping Signal

Sync to CLK


7.56.1 SIGNAL DESCRIPTION

The SNPADS# signal indicates valid cache control
and attribute signals, functioning identically to
CADS#, but is generated only on snoop write-
backs. The separation of address status signals for
normal and snoop write-back cycles eases memory
bus controller implementation. When SNPADS# is
activated, the memory bus controller should abort all
pending cycles for which BGT# has not been is-
sued. The 82495XP reissues these non-committed
cycles after the snoop write-back has completed.


7.56.2 WHEN DRIVEN

SNPADS# is produced when a snoop hits a modi-
fied line. A modified line condition exists when a line
in the cache has been updated, and copies of that
memory location in other devices are no longer val-
id. A snoop is initiated by the master of a shared bus
when accessing a memory location on the shared
bus.

The response of the 82495XP to a snoop appears
on the MTHIT# and MHITM# pins in the clock after
SNPCYC# is active. If these pins are both driven
low, the snoop resulted in a hit to a modified line,
and a snoop write-back is initiated with the assertion
of SNPADS#. SNPADS# is driven, at earliest, two
clocks after SNPCYC#. Like CADS#, SNPADS# is
active for one CLK, and is always valid.

7.56.3 RELATION TO OTHER SIGNALS

Cycles initiated by SNPADS# require only CRDY#;
they do not require the other cycle progress signals
(BGT#, KWEND#, SWEND#).




The SNPADS# signal is driven by the 82495XP to
indicate the start of the write-back cycle; the
82495XP drives the following address and cycle
specification signals valid with SNPADS#: CW/R#,
CD/C#, CM/IO#, MCACHE#, RDYSRC, NENE#,
SMLN#, and the address on MSET[0:10],
MTAG[0:11], and MCFA[0:6]. Upon assertion of
SNPADS#, the memory bus controller should can-
cel all pending cycles for which BGT# has not yet
been asserted, because they will be reissued after
the snoop write-back. The 82495XP will ignore
BGT# while SNPBSY# and MHITM# are active (i.e.,
during the write-back).

The 82495XP can accept a snoop request while per-
forming memory bus transfers of its own. If a snoop
is requested while it is performing a transfer of its
own, the results of the snoop and any necessary
snoop write-backs may be delayed. If SNPSTB# is
sampled at an 82495XP after it has received BGT#
for its own cycle, and the snoop hits a modified line,
the snoop write-back will occur after CRDY# for the
82495XP’s own cycle. See Sections 6.2.4 and 6.2.5
for details.


7.57 SNPBSY# 

Snoop Busy 

Indicates additional snoop processing in progress 
Output from 82495XP (pin F1) Snooping Signal 
Sync to CLK 

7.57.1 SIGNAL DESCRIPTION 

SNPBSY# and SNPCYC# indicate a snoop in prog-
ress. The SNPCYC# signal is asserted on the actual
snoop look-up to the 82495XP tags. If the snoop
look-up indicates a valid line is hit and the snoop is
invalidating, the 82495XP must perform a back inval-
idation on the CPU. If a snoop hit occurs to a modi-
fied line, a snoop write-back must occur. SNPBSY#
is asserted and remains active while either a back
invalidation or a snoop write-back is in progress.

7.57.2 WHEN DRIVEN 

SNPBSY# is activated for two conditions. First,
SNPBSY# is activated whenever a back invalidation
is necessary: the snoop returns MTHIT# active and
SNPINV was asserted on the snoop initiation. Sec-
ond, SNPBSY# is activated when a modified cache
line is hit on a snoop, as indicated by MHITM#, until
the modified line has been written back (CRDY# re-
turned for the write-back).

SNPBSY# is valid in the CLK following SNPCYC#, 
and if active, remains active for a minimum of two 
CLKS. 
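
The two activation conditions can be combined into a single predicate. This is a sketch with a hypothetical function name; each argument is True when the corresponding signal is active, and timing (the two-CLK minimum and the CRDY# completion) is not modeled.

```python
def snpbsy_activated(mthit_active, mhitm_active, snpinv_asserted):
    """Return True when a snoop result would activate SNPBSY#."""
    # Condition 1: snoop hit with an invalidating snoop (back invalidation).
    back_invalidation = mthit_active and snpinv_asserted
    # Condition 2: snoop hit to a modified line (snoop write-back).
    snoop_write_back = mhitm_active
    return back_invalidation or snoop_write_back


print(snpbsy_activated(mthit_active=True, mhitm_active=False,
                       snpinv_asserted=False))
```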


7.57.3 RELATION TO OTHER SIGNALS 

After SNPCYC# occurs for a snoop, a new snoop
may be initiated. If SNPBSY# is asserted for the
initial snoop, the SNPCYC# of the second snoop is
delayed until the SNPBSY# signal is deasserted for
the initial snoop, indicating that its snoop processing
has completed.


7.58 SNPCLK [SNPMD] 

Snoop Clock [Snooping Mode] 

Selects 82495XP snooping mode 
Input to 82495XP (pin S3) Snooping Signal 
Synchronous to CLK 

7.58.1 SIGNAL DESCRIPTION 

SNPMD selects whether the 82495XP snoop initia-
tion is in synchronous, clocked, or strobed mode.
82495XP snoop response is always synchronous to
CLK.

Synchronous mode (to CLK) is selected by SNPMD
sampled low during reset. Strobed mode is selected
by SNPMD sampled high during reset. Clocked
mode is selected by connecting the snoop clock
source to SNPMD, and thus SNPMD becomes the
actual snoop clock (SNPCLK).
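
A small sketch of the mode selection described above. The string labels for what is observed on the SNPMD pin during reset ("low", "high", "clock") and the function name are assumptions chosen for this illustration.

```python
def snoop_mode(snpmd_at_reset):
    """Map the SNPMD observation during reset to a snooping mode.

    snpmd_at_reset is one of "low", "high" or "clock" (a toggling
    snoop clock source connected to the pin).
    """
    modes = {
        "low": "synchronous",  # snoop initiation sampled with CLK
        "high": "strobed",     # snoop initiation on SNPSTB# falling edge
        "clock": "clocked",    # SNPMD pin acts as SNPCLK
    }
    if snpmd_at_reset not in modes:
        raise ValueError(f"unknown SNPMD observation: {snpmd_at_reset!r}")
    return modes[snpmd_at_reset]


print(snoop_mode("clock"))
```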

7.58.2 WHEN SAMPLED 

SNPMD is sampled like Figure 7-1 with a setup time
of 4 CPU clocks. SNPMD is then not used unless
clocked mode is being selected. If clocked mode is
selected, SNPMD becomes SNPCLK to clock in
snoop requests.

7.58.3 RELATION TO OTHER SIGNALS 

SNPMD becomes SNPCLK if a clock signal is de- 
tected at reset. In this clocked mode, SNPCLK is 
then used to clock-in SNPSTB#, the snoop ad- 
dress, and all snoop attributes. 


7.59 SNPCYC# 

Snoop Cycle 

Indicates snoop look-up occurring in 82495XP tags 
Output from 82495XP (pin H3) Snooping Signal
Sync to CLK

7.59.1 SIGNAL DESCRIPTION


SNPCYC# is asserted by the 82495XP during the
clock when the actual tag look-up for the snoop is
performed. SNPCYC# may appear as early as the
CLK following SNPSTB# assertion, or may be de-
layed several clocks while a snoop write-back or
82495XP memory bus cycle takes place.

7.59.2 WHEN DRIVEN

SNPCYC# is always a valid 82495XP output. It is
asserted once, for a single clock, for every snoop
which is initiated in the 82495XP.


7.59.3 RELATION TO OTHER SIGNALS

A snoop is initiated by assertion of the SNPSTB#
input if MAOE# is not asserted. The actual snoop,
signalled by the assertion of SNPCYC#, can be de-
layed by a prior snoop’s write-back in progress
(SNPBSY# asserted) or by an 82495XP memory cy-
cle in progress (SNPSTB# occurs after BGT#);
see SNPSTB# for details. If neither of these is oc-
curring, strobed and clocked snooping modes can
also delay snoop look-up for a clock while the snoop
address and attributes are synchronized.

In the clock following SNPCYC#, MHITM# and
MTHIT# report valid snoop results.


7.60 SNPINV

Snoop Invalidation

Forces invalidation of snoop hits

Input to 82495XP (pin P5) Snooping Signal

Sampled with SNPSTB# (see SNPSTB#)

7.60.1 SIGNAL DESCRIPTION

Assertion of the SNPINV signal during the initiation
of a snoop request forces a snoop hit for that re-
quest into the Invalid state.

The SNPINV pin is sampled upon initiation of a
snoop request with SNPSTB# activation, depending
on snooping mode: rising edge of first CLK when
SNPSTB# is asserted (synchronous snooping mode),
or rising edge of first SNPCLK when SNPSTB# is
asserted (clocked mode), or falling edge of
SNPSTB# (strobed mode).


7.60.2 WHEN SAMPLED

When a bus master performs a bus access, the
SNPSTB# of all other 82495XPs is asserted to initi-
ate a snoop for that address. If the master’s access
is one which is modifying the data (a write to memo-
ry, etc.), the SNPINV pin of all snooping 82495XPs
must be asserted during SNPSTB# so that the line
is properly marked Invalid.

SNPINV is not asserted during SNPSTB# assertion
if snoop hits are to remain valid: the master issuing
the snoop does not require their invalidation (a
read).

SNPINV assertion forces all snoop hits to be invali-
dated, overriding other inputs or attributes (i.e.,
SNPNCA). When SNPINV is not asserted, cache
states change according to normal protocol.

SNPINV is only sampled with SNPSTB#, which may
be qualified by CLK or SNPCLK depending on the
snooping mode, and must meet setup and hold
times for the edge of its sampling. When SNPSTB#
is not being asserted, SNPINV is a don’t care and
need not follow setup and hold times.


7.60.3 RELATION TO OTHER SIGNALS

SNPINV is sampled according to SNPSTB#, which
may be qualified by SNPCLK or CLK, depending on
the snooping mode. SNPINV overrides the SNPNCA
input, which may also be asserted with SNPSTB#. If
MAOE# is active with SNPSTB# sampling, the
snoop request is ignored.


7.61 SNPNCA

Snoop Non Caching device Access

Indicates to snooping 82495XP that the initiating
master is a non-caching device

Input to 82495XP (pin Q3) Snooping Signal
Sampled with SNPSTB# (see SNPSTB#)


7.61.1 SIGNAL DESCRIPTION 

SNPNCA indicates that the master which is initiating
the snoop request will not cache the data. If the
SNPNCA pin is not asserted and the snoop is nonin-
validating (where noninvalidating = SNPINV not as-
serted), a snoop hit line must be placed in the
Shared state, since the data will exist in another
cache. If SNPNCA is asserted and the snoop is non-
invalidating, a snoop hit line will not be entered into a
new cache, so a hit Exclusive or Modified line will be
placed in the Exclusive state by the 82495XP. A
noninvalidating snoop hit to a Shared line must keep
the hit line in the Shared state, regardless of
SNPNCA.

SNPNCA is sampled upon initiation of a snoop re- 
quest with SNPSTB# activation, depending on the 
snooping mode: rising edge of first CLK when 
SNPSTB# asserted (synchronous snooping mode), 
or the rising edge of SNPCLK when SNPSTB# is 
asserted (clocked snooping mode), or the falling 
edge of SNPSTB# (strobed snooping mode). 

7.61.2 WHEN SAMPLED 

To achieve maximum processor performance and 
minimum bus traffic, SNPNCA should be asserted 
when the noninvalidating snoop is caused by an ac- 
cess from a non-caching device like a DMA. 

If the snoop is being caused by a device which will 
also be caching the data, SNPNCA must not be as- 
serted, so that the 82495XP does not leave the hit 
line in an Exclusive state — subsequent writes to 
lines in this state do not appear on the bus, and stale 
data would result in the cache which incorrectly as- 
serted SNPNCA. 

If SNPNCA is asserted on a noninvalidating snoop 
request, the following outlines the behavior of the 
cache for a snoop hit in each of the MESI states: 

Modified    The data is written to the bus, and the
            line is placed in the Exclusive state.

Exclusive   The line remains in the Exclusive state.

Shared      The line remains in the Shared state.

Invalid     This is a cache miss. The line remains
            Invalid.

If SNPNCA is NOT asserted on a noninvalidating 
snoop request, an M, E, or S state hit line will be 
placed in the Shared state. Again, M state causes a 
write to the bus. Invalid lines remain Invalid. 
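
The state changes described for SNPINV and SNPNCA can be collected into one transition function. This is a sketch rather than the device's logic: the function name and single-letter state encoding are assumptions, and a hit to a Modified line is taken to cause a write-back for invalidating snoops as well, following the SNPADS# description.

```python
def snoop_result(state, snpinv, snpnca):
    """Return (new_state, write_back) for a snoop to a line in `state`.

    state is one of "M", "E", "S", "I"; snpinv and snpnca are True
    when the corresponding attribute was asserted with SNPSTB#.
    """
    if state == "I":
        return ("I", False)      # cache miss, nothing changes
    write_back = state == "M"    # modified data is written to the bus
    if snpinv:
        return ("I", write_back) # SNPINV overrides SNPNCA
    if snpnca and state in ("M", "E"):
        return ("E", write_back) # requester will not cache the line
    return ("S", write_back)     # line is, or becomes, Shared


for s in ("M", "E", "S", "I"):
    print(s, snoop_result(s, snpinv=False, snpnca=True))
```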

SNPNCA is only sampled with SNPSTB#, which 
may be qualified by CLK or SNPCLK depending on 
the snooping mode, and must meet setup and hold 
times for the edge of this sampling. When 
SNPSTB# is not being sampled, SNPNCA is a don’t 
care and need not follow set-up and hold times. 

7.61.3 RELATION TO OTHER SIGNALS 

SNPNCA is sampled with SNPSTB#, which may be 
qualified by SNPCLK or CLK, depending on snoop- 
ing mode. The assertion of SNPINV overrides 


SNPNCA, and places all snoop hit lines into the In- 
valid state. If MAOE# is active on SNPSTB# sam- 
pling, the snoop request is ignored. 


7.62 SNPSTB# 

Snoop Strobe 

Initiates 82495XP snoop and latches snoop address 
& attributes 

Input to 82495XP (pin R3) Snooping Signal 
Sync to CLK or SNPCLK, or strobed 

7.62.1 SIGNAL DESCRIPTION 

Snoop strobe initiates a 82495XP snoop request. It 
controls the latching of the snoop address and 
snoop attribute signals, in the manner specified by 
one of three snooping modes: 


Snooping Modes

Mode           Snoop Address/Attributes Sampled on:

Strobed        falling edge of SNPSTB#

Clocked        rising edge of SNPCLK when
               SNPSTB# sampled active

Synchronous    rising edge of CLK when
               SNPSTB# sampled active


SNPSTB# must be asserted to initiate a snoop re-
quest. Snoops are initiated by a bus master for all
memory accesses, to ensure that data residing in
other caches is flushed if modified and invalidated if
necessary.

SNPSTB# must be deasserted for at least one 
SNPCLK or CLK when clocked or synchronous 
snooping mode (respectively) is used, in order to 
rearm for the next snoop. 

SNPSTB# can be asserted while a snoop is in prog-
ress, allowing one level of pipelining. However, the
reassertion of SNPSTB# while snooping is in prog-
ress must not occur until after SNPCYC#: precise-
ly, after the falling edge of SNPCYC# for strobed
and clocked modes, or in the clock after SNPCYC#
is active for synchronous mode. SNPSTB# must not
be asserted between the first and last BGT# of a
locked sequence. Similarly, SNPSTB# must not oc-
cur after the BGT# of the write through and before
the BGT# of the allocation when a Read-for-Owner-
ship transaction is occurring.

SNPSTB# itself does not affect the cache contents
or states, but the snoop signals SNPINV and
SNPNCA, latched upon SNPSTB#, force various
changes in the cache on a snoop hit.

7.62.2 WHEN SAMPLED

SNPSTB# is sampled on every SNPCLK or CLK in 
clocked or synchronous modes, and is sampled con- 
stantly in strobed mode. While a snoop is in prog- 
ress, a new SNPSTB# is recognized as a new, pos- 
sibly pipelined, snoop request. After the assertion of 
a pipelined SNPSTB#, the SNPSTB# signal must 
not be reasserted until after the next SNPCYC#. 

SNPSTB# should always meet proper set-up and hold
times when operating in clocked or synchronous modes.
When operating in strobed mode, it must meet minimum
active/inactive times to be properly recognized in the
next clock.

7.62.3 RELATION TO OTHER SIGNALS 

SNPSTB# latches the following signals: SNPINV,
SNPNCA, MBAOE#, and MAOE#, and the address on the
MSET, MTAG, and MCFA pins. The address which appears
on the MSET, MTAG, and MCFA address pins is to be
snooped in the 82495XP. MAOE# acts as a qualifier for
a snoop; if MAOE# is active when sampled on a SNPSTB#
assertion, the snoop request is ignored. SNPINV and
SNPNCA provide the 82495XP with snoop attributes which
affect the state of a snoop hit cache entry.

If MBAOE# is active during SNPSTB# assertion, 
the 82495XP forces all bits in the subline address 
(those address bits which MBAOE# controls) to 0 
on a snoop write back for that snoop. 

Snoops and memory accesses are interlocked, such
that after BGT# for a memory access has been issued,
a SNPSTB# which is asserted will be latched, with its
address and attributes, but will not cause a snoop
until after SWEND# for that memory cycle. After BGT#
has been issued for a cycle, snoop write-backs are
delayed until after the CRDY# for that cycle.
Likewise, once a snoop is underway (SNPCYC# active),
BGT# is ignored until snoop completion.

SNPSTB# must not be deasserted and reasserted
(specifically, cause a second falling edge) between
its initial recognition and SNPCYC#; i.e., SNPSTB#
must not be asserted before the SNPCYC# of the
previous SNPSTB#. In strobed and clocked modes,
SNPSTB# can be reasserted after the falling edge of
SNPCYC#; in synchronous mode, SNPSTB# can be
reasserted in the CLK after SNPCYC# is active. This
second assertion of SNPSTB#, after SNPCYC#, can occur
while the first snoop is still progressing (SNPBSY# is
active), allowing one level of snoop pipelining. In
this case, a third assertion of SNPSTB# must not occur
until after the SNPCYC# for the second, piped snoop
request.
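
The reassertion rule above can be checked mechanically against a signal trace. The following is a minimal sketch only: the event-list format and the function name are assumptions made for illustration, and the model ignores mode-specific edge timing (falling edge of SNPCYC# versus the CLK after SNPCYC#).

```python
def check_snpstb_sequence(events):
    """Return indices of SNPSTB# assertions that break the reassertion rule.

    events: list of "SNPSTB" / "SNPCYC" strings in time order
    (hypothetical trace format, not produced by any Intel tool).
    """
    awaiting_snpcyc = False   # a strobe was recognized, SNPCYC# not yet seen
    violations = []
    for i, ev in enumerate(events):
        if ev == "SNPSTB":
            if awaiting_snpcyc:
                # Reasserted before the SNPCYC# of the previous SNPSTB#
                violations.append(i)
            awaiting_snpcyc = True
        elif ev == "SNPCYC":
            awaiting_snpcyc = False
        else:
            raise ValueError(f"unknown event {ev!r}")
    return violations
```

A trace in which each SNPSTB# follows the SNPCYC# of the previous one yields an empty list, which corresponds to the permitted single level of pipelining.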


SNPSTB# must not be asserted while the 82495XP
is executing a locked sequence (LOCK# active).
Specifically, SNPSTB# must not be asserted after
the BGT# for the first locked access and before the
BGT# of the last locked access.

Systems which support Read-for-Ownership must
not assert SNPSTB# between the BGT# of the
write through and the BGT# of the allocation during
a Read-for-Ownership operation.


7.63 SWEND# 

Snoop Window End 

Closes Snooping Window 

Input to 82495XP (pin Q1) Cycle Progress Signal 

Synchronous to CLK 

7.63.1 SIGNAL DESCRIPTION 

SWEND# is an input to the 82495XP that, when
asserted, closes the snooping window and causes
sampling of MWB/WT# and DRCTM#. Once
snooping of all other 82495XPs is complete,
DRCTM# and MWB/WT# can be determined.

Snoop response is blocked by the 82495XP between
BGT# and SWEND# activation. Therefore, the sooner
SWEND# is asserted, the sooner snoop responses can be
determined.

All CPU-generated write cycles and cache read miss
cycles must cause a snoop on the memory bus.
SWEND# may be activated once snooping has
completed for these cycles. SWEND# activation
causes the 82495XP’s internal tags to change state
for the current cycle (if necessary). DRCTM# and
MWB/WT# influence the state change decision.

SWEND# need only be activated for those cycles 
which require the sampling of DRCTM# and 
MWB/WT#. 

If a cycle does not specifically require SWEND#, 
and SWEND# is not returned, snooping is blocked 
from BGT# to CRDY#. For this reason, it may be 
more efficient to always return SWEND#. 


7.63.2 WHEN SAMPLED 

SWEND# is sampled by the 82495XP on or after the
clock in which KWEND# is sampled active, for those
cycles that sample KWEND#. For cycles that do not
sample KWEND#, SWEND# is sampled with or after
BGT#. Once SWEND# is sampled active, it is ignored
until KWEND# of the next cycle. If SWEND# is not
being sampled, it may violate setup and hold times.

Snoop response is blocked between BGT# and 
SWEND#. If a snoop is initiated between BGT# 
and SWEND#, the MTHIT# and MHITM# re- 
sponse is given after SWEND# activation. Any sub- 
sequent snoop write back would begin after 
CRDY#. 
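
As an illustration of the blocking window described above, the sketch below computes the earliest clock at which a snoop response could be given. It is a simplified model under stated assumptions: the function name and clock-number arguments are hypothetical, response latency is ignored, and the window is taken to end at CRDY# when SWEND# is not returned, as section 7.63.1 describes.

```python
def snoop_response_clock(snoop_clk, bgt_clk, swend_clk=None, crdy_clk=None):
    """Earliest clock at which MTHIT#/MHITM# may respond (simplified model).

    swend_clk is None when SWEND# is not returned for the cycle; the
    blocking window then runs from BGT# to CRDY#.
    """
    window_end = swend_clk if swend_clk is not None else crdy_clk
    if window_end is None:
        raise ValueError("need the SWEND# clock or the CRDY# clock")
    if bgt_clk <= snoop_clk < window_end:
        return window_end      # response held until the window closes
    return snoop_clk           # snoop arrived outside the blocked window
```

For example, with BGT# in clock 3 and SWEND# in clock 7, a snoop initiated in clock 4 is answered no earlier than clock 7, while the same snoop in a cycle that never returns SWEND# waits for CRDY#.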


7.63.3 RELATION TO OTHER SIGNALS 

SWEND# causes the sampling of MWB/WT# and 
DRCTM#. SWEND# is sampled once KWEND# is 
sampled active. BGT#, KWEND#, and SWEND# 
may be asserted in the same clock. 

SWEND# shares a pin with CFG1. 


7.64 SYNC# 

Sync 

Synchronizes 82495XP TAG array with Main Memory

Input to 82495XP (Q4) Cache Synchronization Signal

Asynchronous 

7.64.1 SIGNAL DESCRIPTION 

SYNC# activation will cause the synchronization of 
the 82495XP and i860 XP CPU tag arrays with main 
memory. The 82495XP will flush all modified entries 
to memory. All valid tag entries will be kept, with 
modified [M] state lines becoming non-modified [E] 
state lines. 
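
The effect of SYNC# on the tag array can be sketched as a simple state mapping. This is an illustrative model only: the dictionary representation of the tag array and the function name are assumptions, and the actual write-back bus cycles are not modeled.

```python
def sync_cache(tags):
    """Model the tag-state effect of a SYNC operation.

    tags: dict mapping line address -> MESI state letter ("M", "E", "S", "I").
    Returns (new_tags, written_back), where written_back lists the
    addresses of modified lines that are flushed to memory.
    """
    written_back = [addr for addr, state in tags.items() if state == "M"]
    # Modified lines become Exclusive; every other entry is kept as-is.
    new_tags = {addr: ("E" if state == "M" else state)
                for addr, state in tags.items()}
    return new_tags, written_back
```

Unlike a flush, no valid entry is invalidated here; only the [M] to [E] transition is applied.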


7.64.2 WHEN SAMPLED 

SYNC# can be asserted at any time. The 82495XP 
will complete all outstanding cycles on the CPU and 
memory bus before beginning the SYNC process. 
The memory bus controller does not have to prevent 
SYNC# during locked cycles because the 82495XP 
will complete its locked cycle before the SYNC pro- 
cess will begin. 

Once a SYNC operation has begun, the SYNC# signal
is ignored until the operation completes. If
RESET or FLUSH# is asserted while the SYNC operation
is in progress, the SYNC operation will be
aborted and the RESET or FLUSH immediately executed.


SYNC# is an asynchronous input. SYNC# must 
have a pulse width of 2 CLK’s in order to guarantee 
82495XP recognition. 

7.64.3 RELATION TO OTHER SIGNALS 

To initiate a SYNC, the 82495XP will complete all
pending cycles and prevent further ADS# assertions
while a SYNC is in progress. The FSIOUT# output
signal is used to indicate the start and end of the
SYNC operation. It will become active when the
SYNC# signal is internally recognized (all outstanding
cycles have completed) and will de-activate
when the SYNC operation has completed.

The memory bus controller supplies BRDY# to the
CPU once the SYNC has completed. Once SYNC
has begun, and FSIOUT# is active, all CADS#’s and
CRDY#’s correspond to the write-backs caused by
the SYNC operation.

The 82495XP can be snooped during SYNC cycles
and the snooping protocols will be the same as those
for any memory bus cycle.


7.65 TCK 

Test Clock 

Clock for the JTAG boundary scan tests 
Input to the i860 XP CPU (pin 01) Test Signal 
Input to the 82495XP (pin P3) 

Input to the 82490XP (pin 3) 

Synchronous 

7.65.1 SIGNAL DESCRIPTION 

TCK is an input to the i860 XP CPU, 82495XP and
82490XP and provides the clocking function required
by the JTAG boundary scan feature. TCK is
used to clock state information and data into and out
of the component. State select information and data
are clocked into the component on the rising edge
of TCK on TMS and TDI, respectively. Data is
clocked out of the part on the falling edge of TCK on
TDO.
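
The edge relationship can be illustrated with a one-bit scan register, such as the bypass register defined by IEEE 1149.1. The sketch below is an assumption-laden model rather than a description of the device internals: it shows what an external tester would sample on TDO at each rising edge of TCK, given that TDO was driven on the preceding falling edge. The function name is hypothetical; the initial value of 0 reflects the value the standard's bypass register captures before shifting.

```python
def shift_through_bypass(tdi_bits, initial=0):
    """Shift bits through a 1-bit register clocked by TCK.

    Returns the TDO value observed by a tester at each rising edge.
    """
    reg = initial
    observed_tdo = []
    for tdi in tdi_bits:
        # Tester samples TDO, which was driven on the previous falling
        # edge from the register contents.
        observed_tdo.append(reg)
        # Rising edge of TCK: the component samples TDI into the register.
        reg = tdi
    return observed_tdo
```

The output is the input stream delayed by one TCK cycle, which is the behavior expected of a single-bit register in the scan path.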

In addition to using TCK as a free running clock, it 
may be stopped in a low, logic 0, state, indefinitely 
as described in IEEE 1149.1. While TCK is stopped 
in the low state, the boundary scan latches retain 
their state. 

When boundary scan is not used, TCK should be
tied low.

7.65.2 WHEN SAMPLED

TCK is a clock signal and is used as a reference for 
sampling other JTAG signals. 

7.65.3 RELATION TO OTHER SIGNALS 

On the rising edge of TCK, TMS and TDI are sampled.
On the falling edge of TCK, TDO is driven.


7.66 TDI

Test Data Input 

Receives serial test instructions and data 
Input to the i860 XP CPU (pin S14) Test Signal
Input to the 82495XP (pin N3) 

Input to the 82490XP (pin 2) 

Synchronous to TCK 

7.66.1 SIGNAL DESCRIPTION 

TDI is the serial input used to shift JTAG instructions
and data into the component. The shifting of
instructions and data occurs during the SHIFT-IR and
SHIFT-DR TAP controller states, respectively.
These states are selected using the TMS signal as
described in chapter 9.

An internal pull-up resistor is provided on TDI to
ensure a known logic state if an open circuit occurs on
the TDI path. Note that when “1” is continuously
shifted into the instruction register, the BYPASS
instruction is selected.


7.66.2 WHEN SAMPLED 

TDI is sampled on the rising edge of TCK, during the
SHIFT-IR and the SHIFT-DR states. During all other
TAP controller states, TDI is a “don’t care”.


7.66.3 RELATION TO OTHER SIGNALS 

TDI is only sampled when TMS and TCK have been
used to select the SHIFT-IR or SHIFT-DR states in
the TAP controller.

For proper initialization of JTAG logic, TDI should be
driven high, “1”, for at least four TCK cycles
following the rising edge of RESET.


7.67 TDO 

Test Data Output 

Outputs serial test instructions and data 

Output from the i860 XP CPU (pin R10) Test Signal 

Output from the 82495XP (pin C4) 

Output from the 82490XP (pin 84) 

Synchronous to TCK 

7.67.1 SIGNAL DESCRIPTION 

TDO is the serial output used to shift JTAG
instructions and data out of the component. The
shifting of instructions and data occurs during the
SHIFT-IR and SHIFT-DR TAP controller states,
respectively. These states are selected using the TMS
signal as described in chapter 9.

When not in SHIFT-IR or SHIFT-DR state, TDO is 
driven to a high impedance state to allow connecting 
TDO of different devices in parallel. 

7.67.2 WHEN DRIVEN

TDO is driven on the falling edge of TCK during the
SHIFT-IR and SHIFT-DR TAP controller states. At
all other times TDO is driven to the high impedance
state.


7.67.3 RELATION TO OTHER SIGNALS

TDO is only driven when TMS and TCK have been
used to select the SHIFT-IR or SHIFT-DR states in
the TAP controller.

7.68 TMS 

Test Mode Select 

Controls testing by selecting mode of operation 
Input to the i860 XP CPU Test Signal 
Input to the 82495XP (pin P2) 

Input to the 82490XP (pin 1) 

Synchronous to TCK 

7.68.1 SIGNAL DESCRIPTION 

TMS is decoded by the JTAG TAP (Test Access
Port) to select the operation of the test logic, as
described in chapter 9.

To guarantee deterministic behavior of the TAP
controller, TMS is provided with an internal pull-up
resistor. If boundary scan is not used, TMS may be tied
high or left unconnected.

7.68.2 WHEN SAMPLED 

TMS is sampled on every rising edge of TCK. 

7.68.3 RELATION TO OTHER SIGNALS 

TMS is used to select the internal TAP states
required to load boundary scan instructions or data on
TDI.

For proper initialization of the JTAG logic, TMS
should be driven high, “1”, for at least four TCK
cycles following the rising edge of RESET.
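
How TMS steers the TAP controller can be sketched with the 16-state machine defined by IEEE 1149.1. The transition table below follows the standard's state diagram; the state names are written in the standard's spelling, while the function names are illustrative assumptions. The sketch checks a general property of the standard (five consecutive TCK rising edges with TMS high reach Test-Logic-Reset from any state) and does not model the device-specific RESET behavior described above.

```python
# state -> (next state if TMS = 0, next state if TMS = 1), per IEEE 1149.1
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}

def tap_step(state, tms):
    """Advance one rising edge of TCK with the given TMS value (0 or 1)."""
    return TAP_NEXT[state][tms]

def tap_run(state, tms_bits):
    """Apply a sequence of TMS values, one per TCK rising edge."""
    for tms in tms_bits:
        state = tap_step(state, tms)
    return state
```

Starting from Test-Logic-Reset, the TMS sequence 0, 1, 0, 0 reaches Shift-DR and 0, 1, 1, 0, 0 reaches Shift-IR, the two states in which TDI is sampled and TDO is driven.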


7.69 Vcc and Vss 

Power and Ground Pins 

See Tables 1.1 and 1.2 for locations. 


7.70 WWOR# 

Weak Write Ordering Mode 
Enforces strong/weak write-ordering policy 
Input to 82495XP (pin Q2) Configuration Signal 
Synchronous to CLK 

7.70.1 SIGNAL DESCRIPTION 

When asserted during reset, the 82495XP enforces
a weak write-ordering policy. If WWOR# is deasserted
during reset, the 82495XP enforces a strong
write-ordering policy.

In a strong write-ordering mode, writes to the memory
bus are forced to occur in the order in which they
were posted by the CPU. In a weak write-ordering
mode it is possible for:

1. A CPU posted write (A) to be waiting in an
82495XP/82490XP memory buffer.

2. A subsequent CPU write (B) to complete in the
82495XP/82490XP because it was a hit to M or E
state.

3. A snoop hit to B to cause a write back of B before
A is written.

In this scenario, B is written to memory before A is,
and thus CPU writes have been reordered.
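
The three-step scenario can be traced with a small model. This is a sketch under a stated assumption: the text does not say how the strong mode prevents the reordering, so the code simply assumes that the posted write is drained before the snoop write-back. The function name and list representation are hypothetical.

```python
def memory_write_order(weak_ordering):
    """Order in which writes A and B reach memory in the scenario above."""
    posted_buffer = ["A"]    # step 1: A is posted and waits in a memory buffer
    modified_lines = ["B"]   # step 2: B completed in the cache (hit to M or E)
    order = []
    # Step 3: a snoop hits B and forces a write-back of B.
    if not weak_ordering:
        # Assumption: strong ordering completes posted writes first.
        order.extend(posted_buffer)
        posted_buffer.clear()
    order.append(modified_lines.pop())
    order.extend(posted_buffer)   # any write still posted goes out afterwards
    return order
```

With weak ordering the result is B then A, matching the reordering the text describes; with the assumed strong behavior the CPU's original order A then B is preserved.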


7.70.2 WHEN SAMPLED 

WWOR# is sampled during reset, as shown in Figure 7-1,
with a setup time of 4 CPU clocks. WWOR# becomes
MALE once FSIOUT# indicates that the 82495XP
reset sequence has completed.

7.70.3 RELATION TO OTHER SIGNALS

WWOR# shares a pin with MALE.


8.0 BUS FUNCTIONAL DESCRIPTION 
AND TIMING 

The 82495XP/82490XP cache core supports a wide
variety of bus transfers to meet the needs of high
performance systems. Bus transfers can be single
cycle or multiple cycle, cacheable or non-cacheable,
64- or 128-bit (memory bus), and locked. To support
multiprocessing systems there are cache
back-invalidation, inquire, snooping, read for
ownership, cache to cache transfers, and locked cycles.

This section begins with read cycles, both cacheable 
and non-cacheable. It moves on to write cycles, 
cacheable and non-cacheable. Snooping cycles are 
discussed next with an example of each snooping 
mode. The remaining sections describe special cy- 
cles: read for ownership, I/O, and locked cycles. 

The cycles shown in this chapter are examples of
various types of 82495XP/82490XP cycles. The purpose
of these examples is to show signal relationships;
they are not necessarily best-case scenarios.


8.1 Read Cycles 

8.1.1 READ HITS 

Read Hit cycles are executed completely within the
CPU/Cache core, and will not be seen by the MBC.

8.1.2 CACHEABLE READ MISSES

8.1.2.1 Read Miss with Clean Replacement

Figure 8.1 illustrates CPU initiated Read cycles that 
miss the 82495XP/82490XP cache and replace a 
non-dirty (eg. clean or empty) line in the cache. In 
such cycles, the 82495XP will instruct the MBC to 
perform a cache line-fill cycle on the memory bus. A 
cache line-fill is a read of a complete 
82495XP/82490XP line from main memory. The line 
is then written into the 82490XP’s array, and data 
transferred to the CPU as requested. If the line 
fetched from main memory replaces a 
82495XP/82490XP cache line which is in valid un- 
modified state ([E] or [S]), then a back-invalidation 
cycle is performed on the CPU bus to guarantee that 
the replaced data is also removed from the CPU’s 
first level cache, thus maintaining the inclusion prop- 
erty. 

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 2) and the associated cycle control signals to 
the MBC (eg. CW/R#, CM/IO#, CD/C#, RDYSRC, 
MCACHE#) in order to schedule the cache line-fill 
operation. MCACHE# is active, indicating that the 
read miss is potentially cacheable by the 82495XP; 
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 2 and 13 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 5 and 16). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 3), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit in the cache.

CNA# is asserted by the MBC (clock 4) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability
attribute of the cycle, it drives the MKEN# signal
accordingly. The MBC also drives the KWEND# signal
at this time, indicating the end of the cacheability
window. The 82495XP samples MKEN# during
KWEND# (clock 5) to determine that the cycle is
indeed cacheable.

The MBC asserts SWEND# when the snoop win- 
dow ends on the memory bus. The 82495XP sam- 
ples MWB/WT# and DRCTM# during SWEND# 
(clock 7) and updates the cache tag state according 
to the consistency protocol. The closure of the 
snoop window also enables the MBC to start provid- 
ing the CPU with data that has been stored in the 
82490XP’s memory cycle buffer. The MBC supplies 
BRDY#s to the CPU (clocks 7-10). 

The first cycle ends when CRDY# is driven active 
by the MBC (clock 10). It is at this time that the data 
in the 82490XP’s memory cycle buffers is loaded 
into the cache SRAM. 

The 82495XP issues a new CADS# in clock 13, 
which also misses the 82495XP/82490XP cache. 
Note that once the cycle progress signals (BGT#, 
CNA#, KWEND#, SWEND#) of a cycle are sam- 
pled asserted, the 82495XP ignores them until the 
CRDY# of that cycle. The 82495XP does not pipe- 
line the cycle progress signals (ie. it will not sample 
them again until after CRDY# of the current memory 
bus cycle). 
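
The clock numbers quoted above for the first cycle of Figure 8.1 can be collected into a small table and checked for the expected progression. This is a sketch only: the clock values are those stated in the text, while the ordering rule (each cycle progress signal occurs in the same clock as, or later than, the one before it) is a simplifying assumption used for illustration, and the names are hypothetical.

```python
# Clock in which each signal is asserted for the first read miss (Figure 8.1)
FIRST_CYCLE = {
    "CADS#": 2, "BGT#": 3, "CNA#": 4, "KWEND#": 5, "SWEND#": 7, "CRDY#": 10,
}

# Assumed progression of a cacheable read miss
PROGRESS_ORDER = ["CADS#", "BGT#", "KWEND#", "SWEND#", "CRDY#"]

def in_order(clocks, order=PROGRESS_ORDER):
    """True if the signals occur in non-decreasing clock order."""
    seq = [clocks[name] for name in order]
    return all(a <= b for a, b in zip(seq, seq[1:]))
```

Non-decreasing rather than strictly increasing order is used because BGT#, KWEND#, and SWEND# may be asserted in the same clock.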

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. MDOE# must be 
inactive to allow the data pins to be used as inputs. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven
active by the MBC (clock 4) to allow sampling of
MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.
MBRDY# is driven active by the MBC in clocks 4 to
6 to cause the memory burst counter to be incremented
and data to be placed into the 82490XP
cache memory cycle buffers. The MBC drives
MEOC# asserted (clock 7) to end the current cycle
on the memory bus and switch memory cycle buffers
for the new cycle. MZBT# is latched at this time
(when MEOC# is sampled asserted and MSEL#
remains low) for the next transfer.

MBRDY# is driven active by the MBC in clocks 15
to 17 to read data into the 82490XP cache memory
cycle buffers. The MBC asserts MEOC# (clock 18)
to end the second read miss cycle on the memory
bus and switch the memory cycle buffers for a new
cycle.

For Strobed Memory Bus Mode, MSEL# is driven
active by the MBC (clock 4) to allow MISTB operation
and to latch MZBT# (on the falling edge of
MSEL#) for the transfer. MISTB is toggled in clocks
5 to 7 to cause the memory burst counter to be
incremented, and data to be placed into the 82490XP
cache memory cycle buffers. Note: MISTB latches
the memory bus data on both the rising and falling
edges. The MBC drives MEOC# asserted (clock 8)
to end the current cycle on the memory bus and
switch memory cycle buffers for the new cycle.
MZBT# for the next cycle is sampled at this time on
the falling edge of MEOC#.

MISTB is toggled by the MBC (clocks 15 to 17) to
read data into the 82490XP memory cycle buffers.
The MBC asserts MEOC# (clock 18) to end the second
read miss cycle on the memory bus and switch
the memory cycle buffers for a new cycle.
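
Because MISTB latches data on both edges, every toggle of the strobe transfers one item. The sketch below illustrates this with a sampled waveform; the list-based waveform format and the function name are assumptions, and setup and hold timing around each edge is not modeled.

```python
def latch_on_mistb_edges(mistb_levels, bus_data):
    """Return the data items latched by MISTB transitions.

    mistb_levels: MISTB logic level (0 or 1) at successive sample points.
    bus_data: memory bus contents at the same sample points.
    """
    latched = []
    prev = mistb_levels[0]
    for level, data in zip(mistb_levels[1:], bus_data[1:]):
        if level != prev:          # rising or falling edge: both latch data
            latched.append(data)
        prev = level
    return latched
```

A waveform of 0, 1, 1, 0, 1 contains three edges and therefore latches three items, one per toggle.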

8.1.2.2 Read Miss with Replacement of Dirty
Line

Figure 8.2 illustrates a CPU read cycle which misses
the 82495XP cache, and requires the replacement
of a modified line (e.g. tag replacement,
lines/sector = 1, line ratio = 1). In such cycles, the
82495XP will instruct the MBC to perform a cache
line-fill on the memory bus, instruct the 82490XP to
fill its write-back buffer with the contents of the array
location corresponding to the line which must be
replaced, and perform a back invalidation to the CPU
to maintain the first and second level cache
consistency. Once the cache line-fill has completed, the
82495XP/82490XP will write back the contents of
the replaced line to main memory from the 82490XP
write-back buffer.

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the
82495XP/82490XP cache where the cache tag
state is looked up. Once the 82495XP determines
the cycle to be a cache miss, it issues CADS#
(clock 1) and the associated cycle control signals to
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#) in order to schedule the cache line-fill
operation. MCACHE# is active, indicating that the
read miss is potentially cacheable by the 82495XP;
RDYSRC is active, indicating that the MBC must
supply BRDY#s to the CPU cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the cycle is
guaranteed to complete on the memory bus. At this
point, the 82490XP’s write-back buffer is prefilled
with the line to be replaced. Once the 82495XP samples
BGT# asserted, it must finish that cycle on the
memory bus. Prior to this point, the cycle can be
aborted by a snoop hit from another cache.

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability
attribute of the cycle, it drives the MKEN# signal
accordingly. The MBC also drives the KWEND# signal
at this time, indicating the end of the cacheability
window. The 82495XP samples MKEN# during
KWEND# (clock 4) to determine that the cycle is
indeed cacheable.

The MBC asserts SWEND# (clock 6) when the 
snoop window ends on the memory bus. The clo- 
sure of the snoop window enables the MBC to start 
providing the CPU with data that has been stored in 
the 82490XP’s memory cycle buffer. The MBC sup- 
plies BRDY#s to the CPU (clocks 6-9) to serve the 
read cycle. Note that data may be supplied to the 
82490XP’s immediately after MSEL# activation, and 
need not wait for SWEND#. 

On the memory bus, the 82495XP issues a write-back
(WB) cycle. CNA# is sampled active in clock 3,
causing the 82495XP to issue the CADS# (also
CDTS#) of the write-back (clock 5). The MBC
knows this is a write-back cycle and not a
CPU-initiated write cycle by sampling MCACHE# asserted.
This tells the MBC how many data transfers are
necessary.

BGT#, CNA#, and KWEND# of the write-back are
sampled asserted by the MBC (clock 9) after the
CRDY# of the read miss cycle (clock 8). At this
point, the 82495XP may issue another CADS# for a
new (unrelated) memory bus cycle. It is at this time
that the data in the 82490XP’s memory cycle buffers
is loaded into the cache SRAM. The data to be written
back to main memory is in the 82490XP’s write-back
buffers.

Figure 8-2. Cacheable Read Miss with Replacement of Dirty Line

The snoop window for the write-back cycle is closed
by the MBC in clock 11, and the cycle is ended by
CRDY# sampled asserted in clock 13.

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven
active by the MBC (clock 3) to allow sampling of
MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.
MBRDY# is driven active by the MBC in clocks 3 to
5 to cause the memory burst counter to be incremented
and data to be placed into the 82490XP
cache memory cycle buffers. The MBC drives
MEOC# asserted (clock 6) to end the current cycle
on the memory bus and switch memory cycle buffers
for the new cycle. MZBT# is latched at this time
(when MEOC# is sampled asserted) for the next
transfer.

The MBC asserts the memory data output enable 
signal (MDOE#, clock 8) to drive the memory data 
outputs. 

MBRDY# is driven active by the MBC in clocks 10
to 12 to write data from the 82490XP cache memory
cycle buffers onto the memory bus. The MBC asserts
MEOC# (clock 13) to end the write-back cycle
on the memory bus and switch the memory cycle
buffers for a new cycle.

For Strobed Memory Bus Mode, MSEL# is driven
active by the MBC (clock 4) to allow MISTB operation
and to latch MZBT# for the transfer (on
MSEL# falling edge). MISTB is toggled in clocks 5
to 7 to cause the memory burst counter to be
incremented, and data to be placed into the 82490XP
cache memory cycle buffers. Note: MISTB latches
the memory bus data on both the rising and falling
edges. The MBC drives MEOC# asserted (clock 8)
to end the current cycle on the memory bus and
switch memory cycle buffers for the new cycle.
MZBT# for the next cycle is latched at this time on
the falling edge of MEOC#.

The MBC asserts MDOE# (clock 9) to drive the 
memory data outputs. 

MOSTB is toggled by the MBC (clocks 10 to 12) to
write data from the 82490XP memory cycle buffers
onto the memory bus. The MBC asserts MEOC#
(clock 13) to end the write-back cycle on the memory
bus and switch the memory cycle buffers for a
new cycle.


8.1.3 NON-CACHEABLE READ MISSES 

8.1.3.1 Read Misses not Cacheable by CPU/Cache Core
and Cacheable by Core, but not by Memory Bus

Figure 8.3 illustrates two CPU read cycles which
miss the 82495XP cache, and are non-cacheable. In
the first cycle, the CPU/Cache core forces the read
to be non-cacheable (as indicated by the
MCACHE# output from the 82495XP). In the second
cycle, non-cacheability of the data is forced by
the memory bus (as indicated by the MKEN# input
from the MBC). Since neither cycle is cacheable,
no line-fill operation is performed; the cycles are
merely echoed to the memory bus.
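
The two ways a read miss becomes non-cacheable can be summarized in a small decision function. This is a sketch only: the function name, the boolean "asserted" arguments (standing for the active-low signals being driven active), and the returned labels are assumptions made for illustration.

```python
def read_miss_action(mcache_asserted, mken_asserted):
    """Classify a read miss from MCACHE# (82495XP output) and MKEN# (MBC input).

    Returns (action, forced_by), where forced_by names the agent that
    made the cycle non-cacheable, or None for a cacheable line-fill.
    """
    if not mcache_asserted:
        # First cycle in Figure 8.3: the core marks the read non-cacheable.
        return ("non-cacheable read", "CPU/Cache core")
    if not mken_asserted:
        # Second cycle: potentially cacheable, but the memory bus refuses.
        return ("non-cacheable read", "memory bus")
    return ("line-fill", None)
```

Only when both signals are active does the read miss result in a line-fill, as in the cacheable read miss examples of section 8.1.2.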

CACHE CONTROL SIGNALS: 

The CPU initiates the first read cycle to the
82495XP/82490XP cache where the cache tag
state is looked up. Once the 82495XP determines
the cycle to be a cache miss, it issues a cycle request
(CADS# in clock 1) and the associated cycle
control signals to the MBC (e.g. CW/R#, CM/IO#,
CD/C#, RDYSRC, MCACHE#) in order to schedule
the read operation. RDYSRC is active, indicating
that the MBC must provide BRDY# to the CPU;
MCACHE# is not active, indicating that the read
miss is not cacheable by the CPU/Cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit from another
cache.

CNA# is asserted by the MBC (clock 3) to indicate
that it is ready to schedule a new memory bus cycle.
Note that after CNA# activation, cycle control signals
are not guaranteed to be valid.

This cycle has already been determined to be
non-cacheable; therefore, the MBC does not need to
assert SWEND#, KWEND#, or MKEN# to the
82495XP/82490XP cache. The MBC supplies
BRDY# to the CPU to complete the cycle to the
CPU. The MBC asserts CRDY# (clock 8) to the
82495XP/82490XP to complete the read miss cycle
on the memory bus.

The 82495XP issues a new (unrelated) cycle request 
(CADS# in clock 5) which also misses the 
82495XP/82490XP cache. Since the 82495XP has 
already sampled CNA# asserted, it issues a new 
CADS# prior to receiving CRDY# of the current cy- 
cle (ie. this cycle is pipelined within the MBC). Note 
that once the cycle progress signals of a cycle are 
sampled asserted, the 82495XP ignores them until 
the CRDY# of that cycle. The 82495XP will not 
sample the cycle progress signals again until after 
the CRDY# of the current memory bus cycle. The 
current read cycle is completed on the bus in clock 8 
with CRDY# assertion. 

The cycle progress signals for the second read miss 
are also valid at this time (clock 5). RDYSRC is ac- 
tive, indicating that the MBC must provide BRDY#s 
to the CPU/Cache core; and MCACHE# is active, 
indicating that the read miss is potentially cacheable 
by the 82495XP/82490XP. 

The MBC issues BGT# and CNA# to the 82495XP
in clock 9 to indicate that the cycle is guaranteed to
complete on the memory bus, and that it is ready to
schedule a new memory bus cycle. KWEND# is asserted
at this time to close the cacheability window.
MKEN# is not active, indicating to the 82495XP that
the read miss cycle is not cacheable by the memory
bus. KWEND# and MKEN# must be returned to the
82495XP at least two clocks prior to BRDY# to inform
the CPU that a line fill will not follow.

The MBC asserts SWEND# (clock 11) to close the
snoop window, and CRDY# (clock 13) to complete
the cycle to the 82495XP/82490XP. Note:
SWEND# is not needed since the cycle was not
cacheable.

NOTE:

Both examples show single cycle read requests.

MEMORY BUS SIGNALS:


The memory address latch enables (MALE and
MBALE) may remain asserted by the MBC to place
the address latches in flow through mode. If the
82495XP is the current bus master, the memory address
output enables (MAOE# and MBAOE#)
should be asserted by the MBC. The memory data
output enable (MDOE#) must be inactive to allow
the data pins to be used as inputs.

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP memory cy- 
cle buffers. 



For Clocked Memory Bus Mode, MEOC# is asserted
by the MBC (clock 6) to latch MZBT# for the
next transfer, and end the current cycle on the memory
bus (MBRDY# and MSEL# are not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the read cycle to begin with a non-zero burst address.


For the second non-cacheable read cycle, MSEL#
is driven active by the MBC (clock 8) to allow sampling
of MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer. Again,
MZBT# is driven high by the MBC to force the transfer
to begin with the correct burst address.
MBRDY# is driven active by the MBC in clock 10 to 
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. The MBC drives MEOC# asserted 
(clock 12) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. 
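
The clocked-mode MZBT# latching rule described above can be modeled in a few lines. This is an illustrative sketch, not the device's implementation: the function name is hypothetical, each tuple stands for one MCLK sampling edge, and active-low pins are represented as 0 when asserted and 1 when inactive.

```python
from typing import Iterable, Optional, Tuple


def latch_mzbt(samples: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Model MZBT# latching in Clocked Memory Bus Mode.

    Each sample is (msel_n, mzbt_n) as seen on one MCLK sampling edge.
    While MSEL# is inactive (1), MZBT# is sampled on every edge.
    When MSEL# is first sampled active (0), the MZBT# value from the
    prior edge is the one used for the next transfer.
    Returns None if MSEL# never goes active or has no prior sample.
    """
    prior_mzbt = None
    for msel_n, mzbt_n in samples:
        if msel_n == 0:        # MSEL# sampled active
            return prior_mzbt  # value captured on the prior MCLK
        prior_mzbt = mzbt_n    # keep sampling while MSEL# is inactive
    return None


# MZBT# is high on the edge before MSEL# goes active, so 1 is latched.
latched = latch_mzbt([(1, 0), (1, 1), (0, 0)])
```

The sketch makes one point explicit: the level of MZBT# on the edge where MSEL# is sampled active does not matter, only the level on the edge before it.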

For Strobed Memory Bus Mode, MEOC# is driven 
active by the MBC (clock 5) to latch MZBT# for the
transfer (on MEOC# falling edge), and end the current
cycle on the memory bus (MISTB is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to
force the read cycle to begin with the correct burst 
address. 

For the second non-cacheable read cycle, MSEL#
is driven active by the MBC (clock 8) to allow MISTB
operation and to latch MZBT# for the transfer (on
MSEL# falling edge). Again, MZBT# is driven high
by the MBC to force the transfer to begin with the 
correct burst address. MISTB is toggled in clock 9 to 
cause the memory burst counter to be incremented, 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. Note: MISTB latches the memory 
bus data on both the rising and falling edges. The 
MBC drives MEOC# asserted (clock 13) to end the 
current cycle on the memory bus and switch memory
cycle buffers for the new cycle. MZBT# for the
next cycle (not shown) is sampled at this time on
the falling edge of MEOC#. 
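
Because a strobe latches data on both its rising and falling edges, every level change of MISTB counts as one transfer and one burst counter increment. The sketch below illustrates that counting under stated assumptions: the function name is hypothetical and the strobe is modeled as a list of logic levels observed over time.

```python
from typing import Sequence


def count_strobe_transfers(levels: Sequence[int]) -> int:
    """Count transfers signalled by a strobe that is active on both edges.

    `levels` is the observed strobe level (0 or 1) over time. Each
    change of level, rising or falling, is counted as one transfer.
    """
    return sum(1 for prev, cur in zip(levels, levels[1:]) if prev != cur)


# Rising, falling, rising: three edges, so three transfers.
transfers = count_strobe_transfers([0, 1, 1, 0, 1])
```

The single-transfer cycles in this example need no strobe toggle at all, which corresponds to a constant level and a count of zero.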


8.2 Write Cycles 

8.2.1 WRITE HITS 


8.2.1.1 Write Hit to [E] or [M] States

CPU initiated write cycles which hit 82495XP entries
tagged in the [E] or [M] states are executed completely
within the CPU/Cache core, and will not be
seen by the MBC.

8.2.1.2 Write Hit to [S] State

Figure 8.4 illustrates CPU initiated write cycles which
hit lines in the 82495XP/82490XP cache array that
are in the shared state. If the 82495XP/82490XP is
used as a write through cache (not write back), the
[S] state is the only state a cached line could be in.
These cycles are posted as are all normal write cycles
(as long as no other write miss is pending).

CACHE CONTROL SIGNALS:

The CPU initiates the write cycle to the
82495XP/82490XP cache where the cache tag
state is looked up. Once the 82495XP determines
the cycle to be a hit to shared state, it posts the write
and returns BRDY# to the CPU.

The 82495XP next issues a cycle request (CADS#
in clock 1), and the associated cycle control signals
to the MBC (eg. CW/R#, CM/IO#, CD/C#,
RDYSRC, MCACHE#, PALLC#) in order to schedule
the write through operation. MCACHE# is not
active since the write will be posted; RDYSRC is not
active, indicating that the 82495XP will supply
BRDY# to the CPU; PALLC# is not active, indicating
that an allocation cycle will not be performed
(regardless of MKEN# state) since the line is already
available in the cache. The MBC must also
latch PWT and PCD on BLE# falling edge in order
to track hits and misses to the [S] state. This is how
an external state tracker can track the [S] state.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 6 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 9). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit from another
cache.

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. KWEND# is 
also driven at this time since the cacheability of this 
cycle is already known and MKEN# is a don’t care. 
It is not necessary that KWEND# be asserted at this 
time. 

The 82495XP provides BRDY# to the CPU since 
the cycles are posted writes. The MBC completes 
the first write hit to [S] state in clock 5 when it as- 
serts CRDY# to the 82495XP/82490XP cache. The 
data is latched into the 82490XP array from the
memory cycle buffer at this time. 

In this example, the 82495XP issues a second write 
to [S] state in clock 6. For this cycle, the 82495XP 
issues the memory bus request (CADS#) as soon
as it can after sampling CNA# asserted. The
82495XP will not wait for KWEND# (if it does not
get asserted immediately as in this example) to issue
CADS# since this is not a potential allocate cycle
(i.e. PALLC# inactive).

The MBC asserts BGT#, CNA#, and KWEND# together
in clock 8 to indicate that the current cycle is
guaranteed to complete and the 82495XP is free to 
schedule a new memory bus cycle. 

Again, the 82495XP provides BRDY# to the CPU 
since the cycles are posted writes. The MBC completes
the second write hit to [S] state in clock 12
when it asserts CRDY# to the 82495XP/82490XP 
cache. The data is latched in to the 82490XP array 
from the memory cycle buffer at this time. 
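
The clock numbers quoted in this walkthrough can be collected into a small table to make the pipelining visible. The sketch below is only a bookkeeping aid: the dictionary layout and function names are hypothetical, and the clock values are the ones stated in the text for the two write hits to [S] state.

```python
# Clock at which each signal is asserted, per the walkthrough above.
WRITE_HIT_S_CYCLES = [
    {"CADS#": 1, "BGT#": 2, "CNA#": 3, "KWEND#": 3, "CRDY#": 5},
    {"CADS#": 6, "BGT#": 8, "CNA#": 8, "KWEND#": 8, "CRDY#": 12},
]


def clocks_to_complete(cycle: dict) -> int:
    """Clocks from the cycle request (CADS#) to completion (CRDY#)."""
    return cycle["CRDY#"] - cycle["CADS#"]


def requested_after_cna(previous: dict, following: dict) -> bool:
    """True if the next CADS# is issued only after the MBC signalled
    readiness for a new cycle (CNA#) on the previous one."""
    return following["CADS#"] > previous["CNA#"]


durations = [clocks_to_complete(c) for c in WRITE_HIT_S_CYCLES]
pipelined = requested_after_cna(*WRITE_HIT_S_CYCLES)
```

With these numbers the first write-through takes 4 clocks from CADS# to CRDY# and the second takes 6, and the second request follows the first CNA# as the text describes.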

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable signal (MDOE#) is asserted by the 
MBC in clock 2 to drive the memory data outputs.

MEOC# is asserted by the MBC (clock 4) to latch
MZBT# for the transfer, and end the current cycle
on the memory bus (MBRDY# is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the write cycle to begin with the correct burst address.
MFRZ# is sampled here (it need not be active
since the cycle is not potentially allocatable).

For the second write through cycle, MSEL# is driv- 
en active by the MBC (clock 7) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer. Again,
MZBT# is driven high by the MBC to force the transfer
to begin with the correct burst address.
MBRDY# is driven active by the MBC in clock 10 to 
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. The MBC drives MEOC# asserted 
(clock 12) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 2 to drive the memory data outputs. 

MEOC# is driven active by the MBC (clock 4) to 
latch MZBT# for the transfer (on MEOC# falling 
edge), and end the current cycle on the memory bus 
(MOSTB is not necessary since this example shows 
a single transfer cycle). MZBT# is driven high by the
MBC in order to force the write cycle to begin with
the correct burst address. 

For the second write through cycle, MSEL# is driv- 
en active by the MBC (clock 6) to allow MOSTB op- 
eration and to latch MZBT# for the transfer (on 
MSEL# falling edge). Again, MZBT# is driven high 
by the MBC to force the transfer to begin with the
correct burst address. MOSTB is toggled in clock 9
to cause the memory burst counter to be increment- 
ed, and data to be placed into the 82490XP cache 
memory cycle buffers. Note: MOSTB latches the 
memory bus data on both the rising and falling edg- 
es. The MBC drives MEOC# asserted (clock 11) to 
end the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT# for
the next cycle (not shown) is sampled at this time
on the falling edge of MEOC#. 

8.2.2 WRITE MISSES 


8.2.2.1 Write Miss with no Allocation

Figure 8.5 illustrates two CPU initiated write cycles
which miss the 82495XP/82490XP cache and are 
not allocatable. The first write cycle begins as a po- 
tentially allocatable cycle, but MKEN# sampled in- 
active indicates that the cycle is not cacheable by 
the memory bus. The second write miss cycle is not 
cacheable by the CPU/82495XP/82490XP as indi- 
cated by the PALLC# output from the 82495XP. 

CACHE CONTROL SIGNALS: 

The CPU initiates the first write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues a cycle request
(CADS# in clock 1) and the associated cycle
control signals to the MBC (eg. CW/R#, CM/IO#,
CD/C#, RDYSRC, MCACHE#, PALLC#) in order
to schedule the write miss operation. RDYSRC is not 
active, indicating that the 82495XP will supply 
BRDY# to the CPU; MCACHE# is not active; 
PALLC# is active, indicating that the cycle is poten- 
tially allocatable. 

The write miss data is posted in the 82490XP’s 
memory cycle buffer, and the cycle completes with 
no wait states to the CPU. The CPU is then free to 
issue another (non-related) cycle while the 82495XP 
completes the current write miss cycle and possible 
allocation. If this new cycle is a cache hit, it will be 
serviced by the 82495XP immediately; but if it is a 
cache miss, its service will wait until the CRDY# of 
the write cycle (and allocation cycle, if executed). 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 7 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the write
through cycle is guaranteed to complete on the
memory bus. Once the 82495XP samples BGT# asserted,
it must finish that cycle on the memory bus.
Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate
that it is ready to schedule a new memory bus cycle.
Notice that the cycle control signals are not guaranteed
to be valid after CNA# activation. Note that
CNA# has no effect before KWEND#.

When the MBC has determined the cacheability at- 
tribute of the write through cycle, it drives the 
MKEN# signal accordingly. The MBC also drives 
the KWEND# signal at this time (clock 4), indicating 
the end of the cacheability window. The 82495XP 
samples MKEN# inactive during KWEND#, indicat- 
ing that the missed cycle is not cacheable and 
should not be allocated. 

The MBC asserts SWEND# (clock 6) when the 
snoop window of the write through cycle ends on the 
memory bus. The MBC may return CRDY# to the 
82495XP/82490XP cache any time after the closure 
of the snoop window. In this example, CRDY# is 
issued by the MBC in clock 8. 

The 82495XP issues a cycle request for the second 
write miss cycle in clock 7. The cycle control signals 
are valid at this time. Note that PALLC# is inactive, 
indicating that the 82495XP/82490XP has deter- 
mined the cycle to not be allocatable. 

The MBC asserts BGT#, CNA#, and KWEND# in
clock 9. MKEN# is a don’t care during the cacheability
window since the cycle is not allocatable. The
snoop window is closed in clock 11, and the cycle is
completed on the memory bus in clock 13 with the
assertion of CRDY# by the MBC.
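
The two write misses above differ only in which side rules out the allocation. A minimal sketch of that decision follows, assuming active-low pins are represented as 0 when asserted and 1 when inactive; the function name is hypothetical and the logic covers only what this section states.

```python
def should_allocate(pallc_n: int, mken_n_at_kwend: int) -> bool:
    """Decide whether a write miss is followed by an allocation.

    pallc_n          -- PALLC# from the 82495XP (0 = potentially allocatable)
    mken_n_at_kwend  -- MKEN# as sampled during KWEND# (0 = cacheable)

    If PALLC# is inactive, MKEN# is a don't care and no allocation occurs.
    """
    if pallc_n != 0:
        return False                 # cache side ruled out allocation
    return mken_n_at_kwend == 0      # memory bus side has the final say


first_cycle = should_allocate(pallc_n=0, mken_n_at_kwend=1)   # MKEN# inactive
second_cycle = should_allocate(pallc_n=1, mken_n_at_kwend=0)  # PALLC# inactive
```

Both cycles in this example evaluate to no allocation; an allocation follows only when PALLC# is active and MKEN# is sampled active during KWEND#.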

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 4 to drive the memory data outputs. 


MEOC# is asserted by the MBC (clock 5) to latch
MZBT# for the transfer, and end the current cycle
on the memory bus (MBRDY# is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the write cycle to begin with the correct burst address.
MFRZ# is sampled here (it need not be active
since the cycle is not potentially allocatable).

For the second non allocatable write cycle, MSEL#
is driven active by the MBC (clock 8) to allow sampling
of MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer. Again,
MZBT# is driven high by the MBC to force the transfer
to begin with the correct burst address.
MBRDY# is driven active by the MBC in clock 10 to
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. 

The MBC drives MEOC# asserted (clock 13) to end 
the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MFRZ# is 
sampled here (it need not be active since the cycle
is not potentially allocatable). MZBT# is also sampled
at this time.

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 2 to drive the memory data outputs. 

MEOC# is driven active by the MBC (clock 5) to 
latch MZBT# for the transfer, and end the current
cycle on the memory bus (MOSTB is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the write cycle to begin with the correct burst address.

For the second write through cycle, MSEL# is driven
active by the MBC (clock 8) to allow MOSTB operation
and to latch MZBT# for the transfer. Again,
MZBT# is driven high by the MBC to force the transfer
to begin with the correct burst address. MOSTB
is toggled in clock 12 to cause the memory burst 
counter to be incremented, and data to be read from 
the 82490XP cache memory cycle buffers. Note: 
MOSTB latches the memory bus data on both the 
rising and falling edges. 

The MBC drives MEOC# asserted (clock 13) to end 
the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT#
and MFRZ# for the next cycle (not shown) are sampled
at this time on the falling edge of MEOC#.

8.2.2.2 Write Miss with Allocation

Figure 8.6 illustrates a CPU initiated write cycle 
which misses the 82495XP/82490XP cache and fol- 
lows the write to main memory with an allocation 
cycle. An allocation is when the cache follows a 
write miss cycle with a line fill. This example as- 
sumes that allocating the new line requires the re- 
placement of a modified line (ie. a write-back to main 
memory). 

CACHE CONTROL SIGNALS: 

The CPU initiates the write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS#
(clock 1) and the associated cycle control signals to
the MBC (eg. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#, PALLC#) in order to schedule the write
operation. MCACHE# is not active; RDYSRC is not
active, indicating that the 82495XP will supply
BRDY#s to the CPU; PALLC# is asserted, indicating
a potential allocate cycle after the write-through
cycle.

The write miss data is posted in the 82490XP’s 
memory cycle buffer, and the cycle completes with 
no wait states to the CPU. The CPU is free to issue 
another (non-related) cycle while waiting for the 
82495XP to complete the allocation. If this new cy- 
cle is a cache hit, it will be serviced by the 82495XP 
immediately; but if it is a cache miss, its service will 
wait until the CRDY# of the allocation. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1, 5 and 10 for the three cycles in this example)
and remains valid until after CNA# is sampled
active by the 82495XP (clocks 4, 10 and 15). MALE 
and MBALE may be used to hold the address as 
necessary. 

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the write
through cycle is guaranteed to complete on the
memory bus. Once the 82495XP samples BGT# asserted,
it must finish that cycle on the memory bus.
Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability attribute
of the write through cycle, it drives the
MKEN# signal accordingly. The MBC also drives
the KWEND# signal at this time, indicating the end 
of the cacheability window. The 82495XP samples 
MKEN# active during KWEND# (clock 4), indicat- 
ing that the missed line should be allocated in the 
cache. 

At the first available time (clock 5), the 82495XP asserts
CADS# to request an allocation cycle. The cycle
control signals are valid at this point: MCACHE#
is active, indicating the cacheability of the line-fill cycle;
RDYSRC is not active, indicating that the MBC
need not supply BRDY#s to the CPU (no BRDY#s 
are necessary for an allocation cycle). 

The MBC asserts SWEND# (clock 6) when the 
snoop window of the write through cycle ends on the 
memory bus. 

The MBC may return CRDY# to the 82495XP/ 
82490XP cache any time after the closure of the 
snoop window. In this example, CRDY# is issued by 
the MBC in clock 8. At this time, the cycle progress 
signals for the allocation cycle may be issued by the 
MBC to complete the line fill. 

Once again, the MBC arbitrates for the memory bus 
and returns BGT# asserted (clock 9) for the allocation
cycle. The MBC also asserts CNA# and
KWEND# at this time. The 82495XP back-invalidates
the CPU to maintain first and second level
cache consistency. 

In clock 10, the 82495XP asserts CADS# for the 
write back cycle (since the miss was to a dirty line). 
CDTS# is asserted by the 82495XP two clocks later 
(clock 12). Note that CDTS# of the write back cycle 
is not asserted with CADS# since the data is not yet 
available in the 82490XP’s write-back buffer. 

The MBC asserts SWEND# (clock 11) when the 
snoop window of the allocation cycle ends on the
memory bus. 

At this time, the MBC may assert CRDY# to the 
82495XP/82490XP cache for the allocation cycle. 
CRDY# assertion will cause the data stored in the 
82490XP’s memory cycle buffers to be latched into 
the cache array. 

On the memory bus, BGT#, CNA#, and KWEND# 
are sampled active in clock 14 for the write back 
cycle. The snoop window is closed two clocks later 
(clock 16) by the MBC with SWEND#, and the write 
back cycle is completed with CRDY# asserted in
clock 18. 
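
The three overlapping cycles in this example (write-through, allocation, write back) obey a common ordering: CRDY# is never returned before the snoop window closes, and a new CADS# may be issued before the previous cycle's CRDY#. The sketch below records the clock numbers stated above and checks both properties. The data layout and function names are hypothetical, and the allocation cycle's CRDY# is left out because the text gives no clock number for it.

```python
# Clock numbers as stated in the walkthrough; missing entries are omitted.
CYCLES = {
    "write_through": {"CADS#": 1, "BGT#": 2, "CNA#": 3, "KWEND#": 4,
                      "SWEND#": 6, "CRDY#": 8},
    "allocation":    {"CADS#": 5, "BGT#": 9, "CNA#": 9, "KWEND#": 9,
                      "SWEND#": 11},
    "write_back":    {"CADS#": 10, "CDTS#": 12, "BGT#": 14, "CNA#": 14,
                      "KWEND#": 14, "SWEND#": 16, "CRDY#": 18},
}


def crdy_not_before_swend(cycle: dict) -> bool:
    """CRDY# may be returned only once the snoop window has closed.
    Cycles with no recorded CRDY# clock pass trivially."""
    if "CRDY#" not in cycle:
        return True
    return cycle["CRDY#"] >= cycle["SWEND#"]


def overlaps_previous(previous: dict, following: dict) -> bool:
    """True if the next request is issued before the previous CRDY#."""
    return following["CADS#"] < previous["CRDY#"]


ordering_ok = all(crdy_not_before_swend(c) for c in CYCLES.values())
allocation_is_pipelined = overlaps_previous(
    CYCLES["write_through"], CYCLES["allocation"])
```

The allocation request at clock 5 precedes the write-through CRDY# at clock 8, which is the pipelining that CNA# at clock 3 makes possible.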

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

MEOC# is asserted by the MBC (clock 4) to latch
MZBT# for the transfer, and end the current cycle
on the memory bus (MBRDY# is not necessary
since this example shows a single transfer write
miss cycle). MZBT# is driven high by the MBC in
order to force the write cycle to begin with the correct
burst address. MFRZ# is driven inactive by the
MBC here, allowing the line to be placed into the 
exclusive ([E]) state and requiring the data to be 
written to main memory. 

For the allocation (line fill) cycle, MSEL# is driven 
active by the MBC (clock 6) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.
MDOE# is also deasserted in clock 6 to allow the 
data pins to be used as inputs for the allocation cy- 
cle. 

MBRDY# is driven active by the MBC in clocks 7 to
9 to cause the memory burst counter to be incre- 
mented and data to be placed into the 82490XP 
cache memory cycle buffers. The MBC drives 
MEOC# asserted (clock 10) to end the allocation 
cycle on the memory bus and switch memory cycle 
buffers for the new cycle. MZBT# is sampled and
latched at this time for the next data transfer. 

MDOE# is asserted by the MBC (clock 12) to drive
the memory data outputs for the write back cycle. 

The MBC again asserts MBRDY# (clocks 13 to 15) 
for the write back cycle to increment the memory 
burst counter and cause data to be read from the 
82490XP memory cycle buffers. The write back cy- 
cle ends on the memory bus and switches memory 
cycle buffers with MEOC# assertion (clock 16). 
MZBT# and MFRZ# for the next transfer are sam- 
pled at this time. MFRZ# need not be active since 
the cycle is not potentially allocatable. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs for the write 
miss cycle. 


MEOC# is driven active by the MBC (clock 4) to 
latch MZBT# for the transfer, and end the current
cycle on the memory bus (MOSTB is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the write cycle to begin with the correct burst address.
MFRZ# is driven deasserted by the MBC
here, allowing the line to be placed into the exclu- 
sive ([E]) state. 

For the allocation (line fill) cycle, MSEL# is driven 
active by the MBC (clock 6) to allow MISTB operation
and to latch MZBT# for the transfer. MISTB is
toggled in clocks 8 to 10 to cause the memory burst 
counter to be incremented, and data to be placed 
into the 82490XP cache memory cycle buffers. 
Note: MISTB latches the memory bus data on both 
the rising and falling edges. MDOE# is also deas- 
serted in clock 6 to allow the data pins to be used as 
inputs for the allocation cycle. 

The MBC drives MEOC# asserted (clock 11) to end
the allocation cycle on the memory bus and switch
memory cycle buffers for the new cycle. MZBT# for
the next cycle is latched at this time on the falling
edge of MEOC#. 

MDOE# is asserted by the MBC (clock 14) to drive 
the memory data outputs for the write back cycle. 

The MBC toggles MOSTB (clocks 15 to 17) for the 
write back cycle to increment the memory burst 
counter and cause data to be read from the 
82490XP memory cycle buffers. 

The write back cycle ends on the memory bus and 
switches memory cycle buffers with MEOC# asser- 
tion (clock 18). MZBT# and MFRZ# for the next 
transfer are sampled at this time. MFRZ# need not 
be active since the cycle is not potentially allocata- 
ble. 


8.3 Snooping Cycles 

8.3.1 SYNCHRONOUS SNOOPING MODE 
(HIT TO [M] LINE) 

Figure 8.7 illustrates a snoop hit to a dirty line se- 
quence occurring simultaneously with a CPU initiat- 
ed read miss cycle. This example assumes synchro- 
nous snooping mode (ie. requests for snoops are 
done via SNPSTB# from the MBC, sampled on the 
82495XP’s CLK). 


[Figure 8.7: timing diagram for the synchronous snooping
mode example (snoop hit to [M] line), showing the cache
control signals and the clocked and strobed memory bus
signals]


CACHE CONTROL SIGNALS: 

In clock 1 SNPSTB# is asserted by the MBC, indicating
to the 82495XP a request for snooping. The
82495XP samples MAOE# (it must be inactive) in
order to recognize the snoop request. It is latched 
together with the snoop address (MSET[0:10], 
MTAG[0:11], MCFA[0:6]), SNPINV, MBAOE#, and 
SNPNCA on the 82495XP’s CLK during SNPSTB# 
assertion. The tag look-up is done immediately after 
SNPSTB# is sampled active since snoop opera- 
tions have the highest priority in the cache tag state 
arbiter. The 82495XP issues SNPCYC# (clock 2), 
indicating that the snoop look-up is in progress. The 
results of the look-up are driven to the memory bus 
via MTHIT# and MHITM# in the next clock after 
SNPCYC#. Since the snoop hit a modified line, both 
signals are asserted (clock 3). SNPBSY# is also is- 
sued to indicate that the 82495XP is busy with CPU 
back-invalidations, the 82490XP’s snoop buffer is 
full, or a write back is to follow. The 82495XP will 
accept snoops only when SNPBSY# is inactive.
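
The snoop look-up result can be summarized as a mapping from the tag state to the pair MTHIT#/MHITM#. The sketch below is illustrative only. The hit-to-modified case (both asserted) comes from this example, and a snoop miss drives both inactive. The encoding for a hit to an [E] or [S] line (MTHIT# asserted, MHITM# inactive) is an assumption, not stated in this section. Active-low pins are represented as 0 when asserted, and the function name is hypothetical.

```python
from typing import Tuple


def snoop_response(tag_state: str) -> Tuple[int, int]:
    """Return (mthit_n, mhitm_n) for a snooped line in the given state.

    tag_state is one of "M", "E", "S", "I".
    0 means asserted (active low), 1 means inactive.
    The "E"/"S" result is an assumed encoding for illustration.
    """
    if tag_state not in ("M", "E", "S", "I"):
        raise ValueError(f"unknown tag state: {tag_state!r}")
    hit = tag_state != "I"
    hit_modified = tag_state == "M"
    return (0 if hit else 1, 0 if hit_modified else 1)


modified_hit = snoop_response("M")   # both signals asserted
snoop_miss = snoop_response("I")     # both signals inactive
```

In the Figure 8.7 example the snooped line is in [M] state, so both outputs are asserted in clock 3 and a write back follows.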

Simultaneously with the memory bus activity due to 
the snoop request, the CPU initiates a read miss cy- 
cle. The 82495XP issues a memory bus request 
(CADS#), CDTS#, and cycle control signals to the 
MBC in clock 3. The MBC must wait for the pending 
snoop cycle to complete on the memory bus prior to 
servicing this read miss cycle. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is not valid until MAOE# 
goes active after CRDY# of the snoop write back 
cycle is sampled active by the 82495XP and the 
CADS# is reissued (clock 13). 

In clock 4 the 82495XP issues SNPADS# and cycle 
control signals to the MBC, indicating a request to 
flush a modified line out of the cache. SNPADS# 
activation causes the MBC to abort the pending read 
miss cycle. It is the 82495XP’s responsibility to re-issue
the aborted cycle after the completion of the
write back, since BGT# was not asserted by the 
MBC. 
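
Whether a pending cycle survives a snoop write back depends on whether BGT# has already been sampled asserted. A minimal sketch of that rule follows; the function name and the returned labels are hypothetical and chosen only for illustration.

```python
def pending_cycle_outcome(bgt_sampled_asserted: bool,
                          snpads_asserted: bool) -> str:
    """Classify what happens to a cycle the 82495XP has requested.

    Once BGT# has been sampled asserted the cycle must complete on the
    memory bus. Before that point, a snoop write back request
    (SNPADS#) aborts it, and the 82495XP re-issues it after the write
    back completes.
    """
    if bgt_sampled_asserted:
        return "must_complete"
    if snpads_asserted:
        return "abort_then_reissue"
    return "still_pending"


# The read miss in this example: no BGT# yet, SNPADS# issued in clock 4.
read_miss = pending_cycle_outcome(bgt_sampled_asserted=False,
                                  snpads_asserted=True)
```

This is why the read miss requested in clock 3 is re-issued in clock 13 rather than resumed.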

Data is loaded into the 82490XP’s snoop buffer.
Since SNPINV was sampled asserted by the
82495XP (clock 1) during SNPSTB# assertion, it
back-invalidated the CPU’s first level cache.

The 82495XP asserts CDTS# (clock 8) indicating to 
the MBC that data is available in the snoop buffer. 
When the MBC completes the write back cycle on the
memory bus, it activates CRDY# to the
82495XP/82490XP cache. At this time, the 
82495XP deasserts SNPBSY# (clock 13) and re-is- 
sues the aborted read miss cycle (clock 13) by as- 
serting CADS# and CDTS#. 


MEMORY BUS SIGNALS: 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is not activated by the MBC 
to allow the memory data pins to be used as inputs. 

MSEL# is driven active by the MBC (clock 4) to al- 
low sampling of MBRDY# and to latch MZBT# for 
the read miss transfer. MZBT# is sampled on all 
MCLK rising edges where MSEL# is inactive. Once 
MSEL# is sampled active by the 82495XP, the value
of MZBT# sampled on the prior MCLK is used
for the next transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 6) and reasserted (clock 8) to 
allow latching of MZBT# for the snoop write back
cycle and sampling of MBRDY# for that cycle. 
MFRZ# is also sampled at this time. 

The memory data output enable (MDOE#) signal is 
driven active by the MBC (clock 7) to drive the mem- 
ory data outputs. 

MBRDY# is driven active by the MBC in clocks 10 
to 12 to cause the memory burst counter to be incremented
and data to be written from the 82490XP
cache snoop buffers. The MBC drives MEOC# asserted
(clock 13) to end the write back cycle on the
memory bus and switch memory cycle buffers for 
the new cycle. MZBT# and MFRZ# are sampled 
and latched at this time for the next data transfer. 

MDOE# is deasserted by the MBC (clock 14) to al- 
low the memory data pins to be used as inputs for 
the reissued read cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has not been asserted by 
the MBC to allow the memory data pins to be used 
as inputs for the read miss cycle. 

MSEL# is asserted by the MBC (clock 4) to allow 
sampling of MISTB and latch MZBT# (on the falling
edge of MSEL#) for the read miss transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 5) and reasserted (clock 6) to 
allow latching of MZBT# for the snoop write back
cycle and sampling of MOSTB for that cycle. 
MFRZ# is also sampled at this time. 

MOSTB is toggled in clocks 11 to 13 to cause the
memory burst counter to be incremented, and data
to be read from the 82490XP cache memory cycle
buffers. Note: MOSTB latches the memory bus data 
on both the rising and falling edges. The MBC drives 
MEOC# asserted (clock 14) to end the snoop write 
back cycle on the memory bus and switch memory 
cycle buffers for the new cycle. MZBT# and 
MFRZ# for the next cycle, are latched at this time 
on the falling edge of MEOC#. 

MDOE# is deasserted by the MBC (clock 14) to allow
the memory data pins to be used as inputs for
the reissued read miss cycle. 

8.3.2 CLOCKED SNOOPING MODE 

Figure 8.8 illustrates a CPU initiated read cycle
which misses the 82495XP/82490XP cache and the
subsequent line fill replaces non dirty data (eg. clean
or empty). Simultaneous with the read request to the
MBC, that device initiates a snoop to the 82495XP
which misses that line in the cache. The snoop is the
result of a write cycle on the memory bus by some
other cache core; the MBC therefore asserts the snoop
invalidation signal (SNPINV) to this 82495XP. This
example assumes Clocked Snooping Mode (i.e. the
requests for snoops are done via SNPSTB# from the
MBC, sampled on the MBC’s SNPCLK).

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 1) and the associated cycle control signals to
the MBC (eg. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#) in order to schedule the cache line-fill
operation. MCACHE# is active, indicating that the
read miss is potentially cacheable by the 82495XP;
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

In clock 3, SNPSTB# is asserted by the MBC,
indicating to the 82495XP a request for snooping.
MAOE# is deasserted to allow the forthcoming
snoop (the 82495XP will not recognize the snoop if
MAOE# is active). It is latched together with the
snoop address (MSET[0:10], MTAG[0:11],
MCFA[0:6]), SNPINV, MBAOE#, and SNPNCA on
the MBC’s SNPCLK rising edge during SNPSTB#
assertion. SNPINV is asserted from the MBC since
the cache core which initiated the snoop issued a
write cycle on the memory bus. If the response of
the snoop to this 82495XP was a cache hit, the contents
would no longer be valid due to that write.
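
The condition for recognizing a snoop request can be expressed as a one-line predicate. This is a sketch under the usual active-low convention (0 = asserted, 1 = inactive); the function name is hypothetical.

```python
def snoop_recognized(snpstb_n: int, maoe_n: int) -> bool:
    """True if the 82495XP will recognize a snoop request.

    The request (SNPSTB# asserted) is recognized only while MAOE# is
    inactive, i.e. while the 82495XP is not driving the memory address
    bus itself.
    """
    return snpstb_n == 0 and maoe_n == 1


recognized = snoop_recognized(snpstb_n=0, maoe_n=1)  # MAOE# deasserted first
ignored = snoop_recognized(snpstb_n=0, maoe_n=0)     # MAOE# still active
```

This is why the MBC deasserts MAOE# before strobing the snoop in clock 3, and reasserts it afterwards so the read miss address can be driven.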


Following synchronization to the 82495XP CLK, it 
issues SNPCYC# (clock 5), indicating that the 
snoop look-up is in progress. The results of the look-up
are driven to the memory bus via MTHIT# and
MHITM# in the next clock after SNPCYC#. Since
the snoop was a miss in the cache, both signals are 
inactive (clock 6). Note that SNPBSY# will not be 
asserted since the snoop was a miss to this cache. 
The snoop from another cache is complete at this
point, and the read miss cycle will continue. 

The MBC asserts MAOE# to allow this 82495XP to 
drive its address on the memory bus in order to complete
the read miss cycle. The memory bus address
(MSET[10:0], MTAG[11:0], MCFA[6:0]) is valid after
MAOE# assertion (clock 6 for the read cycle in
this example) and remains valid until after CNA# is 
sampled active by the 82495XP (clock 8). MALE and 
MBALE may be used to hold the address as neces- 
sary. 

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 6), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit from another
cache.

CNA# is asserted by the MBC (clock 7) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability at- 
tribute of the cycle, it drives the MKEN# signal ac- 
cordingly. The MBC also drives the KWEND# signal 
at this time, indicating the end of the cacheability 
window. The 82495XP samples MKEN# during 
KWEND# (clock 7) to determine that the cycle is 
indeed cacheable. 

The MBC asserts SWEND# when the snoop window
ends on the memory bus. The 82495XP samples
MWB/WT# during SWEND# (clock 9) and updates
the cache tag state according to the consistency
protocol. The closure of the snoop window also
enables the MBC to start providing the CPU with
data that has been stored in the 82490XP’s memory
cycle buffer. The MBC supplies BRDY#s to the CPU
(clocks 9-12).

The read miss cycle ends when CRDY# is driven
active by the MBC (clock 12). It is at this time that
the data in the 82490XP’s memory cycle buffers is
loaded into the cache SRAM.

Figure 8-8. Clocked Snooping Mode

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. (Note the use of 
MAOE# for snooping at the beginning of the cache 
control signals section.) MDOE# must be inactive to 
allow the data pins to be used as inputs. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven
active by the MBC (clock 6) to allow sampling of
MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.
MBRDY# is driven active by the MBC in clocks 7 to
9 to cause the memory burst counter to be
incremented and data to be placed into the 82490XP
cache memory cycle buffers. The MBC drives
MEOC# asserted (clock 10) to end the current cycle
on the memory bus and switch memory cycle buffers
for the new cycle. MZBT# is sampled at this time
(when MEOC# is sampled asserted and MSEL#
remains low) for the next transfer.
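The MZBT# latching rule for Clocked Memory Bus Mode (sample on every MCLK edge while MSEL# is inactive, then use the value from the MCLK before MSEL# is first seen active) can be sketched as below. This is a simplified model for illustration only; the function name `latched_mzbt` and the list-per-clock representation are assumptions, and setup and hold timing is ignored.

```python
def latched_mzbt(msel_n, mzbt_n):
    """Return the MZBT# level used for the transfer, or None if undetermined.

    msel_n and mzbt_n are equal-length lists of pin levels, one entry per
    MCLK sampling edge (0 = asserted, 1 = deasserted).
    """
    previous = None
    for msel, mzbt in zip(msel_n, mzbt_n):
        if msel == 0:
            # MSEL# sampled active: the value from the prior MCLK is used.
            return previous
        # MSEL# inactive: MZBT# keeps being sampled on every edge.
        previous = mzbt
    return None  # MSEL# never went active in the observed window
```

With `msel_n = [1, 1, 0, 0]` and `mzbt_n = [1, 0, 1, 1]`, the value used is the one sampled on the second edge, immediately before MSEL# goes active.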

For Strobed Memory Bus Mode, MSEL# is driven
active by the MBC (clock 6) to allow MISTB
operation and to latch MZBT# (on the falling edge of
MSEL#) for the transfer. MISTB is toggled in clocks
8 to 10 to cause the memory burst counter to be
incremented, and data to be placed into the
82490XP cache memory cycle buffers. Note: MISTB
latches the memory bus data on both the rising and
falling edges. The MBC drives MEOC# asserted
(clock 11) to end the current cycle on the memory
bus and switch memory cycle buffers for the new
cycle. MZBT# for the next cycle is sampled at this
time on the falling edge of MEOC#.
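Because MISTB latches data on both edges, every change of level counts as one transfer. The short sketch below counts transfers from a sampled MISTB waveform; the function name `count_transfers` and the sampled-list representation are assumptions made only for illustration.

```python
def count_transfers(mistb_levels):
    """Count data latches for a sequence of sampled MISTB levels.

    Each level change, rising or falling, latches one transfer and
    increments the memory burst counter once.
    """
    pairs = zip(mistb_levels, mistb_levels[1:])
    return sum(1 for before, after in pairs if before != after)
```

A waveform sampled as `[0, 1, 0, 1, 0]` therefore represents four transfers, not two.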


8.3.3 STROBED SNOOPING MODE 
(HIT TO [M] LINE) 

Figure 8-9 illustrates a snoop hit to a dirty line
sequence occurring simultaneously with a CPU
initiated read miss cycle. This example assumes
strobed snooping mode (i.e. requests for snoops are
done from the falling edge of SNPSTB#).


CACHE CONTROL SIGNALS: 

In clock 1 (totally asynchronous to any clock)
SNPSTB# is asserted by the MBC, indicating to the
82495XP a request for snooping. The 82495XP
samples MAOE# (it must be inactive) in order to
recognize the snoop request. MAOE# is latched
together with the snoop address (MSET[0:10],
MTAG[0:11], MCFA[0:6]), SNPINV, MBAOE#, and
SNPNCA on the falling edge of SNPSTB#. The
82495XP issues SNPCYC# (clock 3), indicating that
the snoop look-up is in progress. The results of the
look-up are driven to the memory bus via MTHIT#
and MHITM# in the next clock after SNPCYC#.
Since the snoop hit a modified line, both signals are
asserted (clock 4). SNPBSY# is also issued to
indicate that the 82495XP is busy with CPU
back-invalidations, the 82490XP’s snoop buffer is
full, or a write back is to follow. The 82495XP will
accept snoops only when SNPBSY# is inactive.

Simultaneously with the memory bus activity due to
the snoop request, the CPU initiates a read miss
cycle. The 82495XP issues a memory bus request
(CADS#), CDTS#, and cycle control signals to the
MBC in clock 1. The MBC must wait for the pending
snoop cycle to complete on the memory bus prior to
servicing this read miss cycle.

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is not valid until MAOE# 
goes active after CRDY# of the snoop write back 
cycle is sampled active by the 82495XP and the 
CADS# is reissued (clock 15). 

In clock 5 the 82495XP issues SNPADS# and cycle
control signals to the MBC, indicating a request to
flush a modified line out of the cache. SNPADS#
activation causes the MBC to abort the pending read
miss cycle. It is the 82495XP’s responsibility to
reissue the aborted cycle after the completion of the
write back, since BGT# was not asserted by the
MBC.
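The abort rule used here (a pending cycle may be aborted by a snoop write back only while BGT# has not yet been sampled asserted) can be expressed as a one-line check. This is a hedged sketch: the function name `pending_cycle_outcome` and its return strings are hypothetical, and the case where BGT# and SNPADS# fall in the same clock is not settled by the text, so the sketch simply treats it as aborted.

```python
def pending_cycle_outcome(snpads_clock, bgt_clock=None):
    """Decide the fate of a pending cycle when SNPADS# is issued.

    snpads_clock: clock in which the 82495XP issues SNPADS#.
    bgt_clock: clock in which BGT# was sampled asserted for the pending
               cycle, or None if BGT# has not been asserted.
    """
    if bgt_clock is not None and bgt_clock < snpads_clock:
        # BGT# already sampled: the cycle must finish on the memory bus.
        return "committed"
    # No BGT# yet: the MBC aborts; the 82495XP reissues CADS# later.
    return "aborted, reissued after write back"
```

In the Figure 8-9 sequence, `pending_cycle_outcome(5)` describes the read miss that is aborted in clock 5 and reissued in clock 15.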

Data is loaded into the 82490XP’s snoop buffer.
Since SNPINV was sampled asserted by the
82495XP (clock 1) during SNPSTB# assertion, it
back-invalidated the CPU’s first level cache.

The 82495XP asserts CDTS# (clock 9) indicating to
the MBC that data is available in the snoop buffer.
When the MBC completes the write back cycle on the
memory bus, it activates CRDY# to the
82495XP/82490XP cache. At this time, the
82495XP deasserts SNPBSY# (clock 15) and
reissues the aborted read miss cycle by asserting
CADS# and CDTS#.

MEMORY BUS SIGNALS:

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is not activated by the MBC 
to allow the memory data pins to be used as inputs. 

MSEL# is driven active by the MBC (clock 2) to
allow sampling of MBRDY# and to latch MZBT# for
the read miss transfer. MZBT# is sampled on all
MCLK rising edges where MSEL# is inactive. Once
MSEL# is sampled active by the 82495XP, the
value of MZBT# sampled on the prior MCLK is used
for the next transfer.

Since the read miss cycle is aborted due to the
snoop hit to a modified line (requires a write back
cycle), no MEOC# is given. MSEL# is deasserted
by the MBC (clock 9) and reasserted (clock 11) to
allow latching of MZBT# for the snoop write back
cycle and sampling of MBRDY# for that cycle.
MFRZ# is also sampled at this time.

The memory data output enable (MDOE#) signal is 
driven active by the MBC (clock 9) to drive the mem- 
ory data outputs. 

MBRDY# is driven active by the MBC in clocks 11
to 13 to cause the memory burst counter to be
incremented and data to be written from the 82490XP
cache memory cycle buffers. The MBC drives
MEOC# asserted (clock 14) to end the write back
cycle on the memory bus and switch memory cycle
buffers for the new cycle. MZBT# and MFRZ# are
sampled at this time for the next data transfer.

MDOE# is deasserted by the MBC (clock 16) to al- 
low the memory data pins to be used as inputs for 
the reissued read cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has not been asserted by 
the MBC to allow the memory data pins to be used 
as inputs for the read miss cycle. 

MSEL# is asserted by the MBC (clock 2) to allow 
sampling of MISTB and latch MZBT# (on the falling 
edge of MSEL#) for the read miss transfer. 

Since the read miss cycle is aborted due to the
snoop hit to a modified line (requires a write back
cycle), no MEOC# is given. MSEL# is deasserted
by the MBC (clock 9) and reasserted (clock 11) to
allow latching of MZBT# for the snoop write back
cycle and sampling of MOSTB for that cycle.
MFRZ# is also sampled at this time.

MOSTB is toggled in clocks 12 to 14 to cause the
memory burst counter to be incremented, and data
to be read from the 82490XP cache memory cycle
buffers. Note: MOSTB latches the memory bus data
on both the rising and falling edges.

The MBC drives MEOC# asserted (clock 15) to end
the snoop write back cycle on the memory bus and
switch memory cycle buffers for the new cycle.
MZBT# and MFRZ# for the next cycle are sampled
at this time on the falling edge of MEOC#.

MDOE# is deasserted by the MBC (clock 16) to al- 
low the memory data pins to be used as inputs for 
the reissued read miss cycle. 

8.3.4 CACHE TO CACHE TRANSFER 

8.3.4.1 Read Cycles Causing a Snoop Hit
to [M] Line

Figure 8-10 illustrates CPU initiated Read cycles that
miss the 82495XP/82490XP cache and replace a
non-dirty (i.e. clean) line in the cache. During the
snoop window, the memory bus attribute which
indicates a direct to [M] state transfer is sampled active.
In such cycles, the 82495XP will instruct the MBC to
perform a cache line-fill cycle on the memory bus.
The request for data will not go to main memory, but
instead will go to the controller of the cache which
contained the modified data. The line is then written
into the 82490XP’s array, and data transferred to the
CPU as requested. If the line fetched from the
second cache replaces a line which is in valid
unmodified state ([E] or [S]), then a back-invalidation cycle
is performed on the CPU bus to guarantee that the
replaced data is also removed from the CPU’s first
level cache, thus maintaining the inclusion property.

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the
82495XP/82490XP cache where the cache tag
state is looked up. Once the 82495XP determines
the cycle to be a cache miss, it issues CADS#
(clock 2) and the associated cycle control signals to
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#) in order to schedule the cache line-fill
operation. MCACHE# is active, indicating that the
read miss is potentially cacheable by the 82495XP;
RDYSRC is active, indicating that the MBC must
supply BRDY#s to the CPU cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 2 and 13 for the two read miss cycles in this
example) and remains valid until after CNA# is
sampled active by the 82495XP (clocks 5 and 16). MALE
and MBALE may be used to hold the address as
necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 3), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit from another
cache.

CNA# is asserted by the MBC (clock 4) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability
attribute of the cycle, it drives the MKEN# signal
accordingly. The MBC also drives the KWEND# signal
at this time, indicating the end of the cacheability
window. The 82495XP samples MKEN# and
MRO# during KWEND# (clock 5) to determine that
the cycle is indeed cacheable.

The MBC asserts SWEND# when the snoop window
ends on the memory bus. The 82495XP samples
MWB/WT# and DRCTM# during SWEND#
(clock 7) and updates the cache tag state according
to the consistency protocol. Since the result of the
snoop was a hit to a modified line in another cache,
the MBC asserts DRCTM# at this time (this is an
option to save time by skipping the main memory
access, not a requirement of the memory bus) so
that the tag state will go immediately to the [M]
state, skipping the [E] state. MWB/WT# must be in
write back mode (high) to assure this transition. The
closure of the snoop window also enables the MBC
to start providing the CPU with data that has been
stored in the 82490XP’s memory cycle buffer. The
MBC supplies BRDY#s to the CPU (clocks 7-10).
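How the attributes sampled during SWEND# select the tag state of the filled line can be sketched as follows. Only the [M] and [E] branches follow directly from the paragraph above; placing a write-through line (MWB/WT# low) in the [S] state is an assumption about the consistency protocol, and MRO# is ignored. The function name `fill_tag_state` is hypothetical.

```python
def fill_tag_state(mwb_wt: int, drctm_n: int) -> str:
    """Pick the tag state for a line fill from pins sampled during SWEND#.

    mwb_wt:  MWB/WT# level (1 = write back, 0 = write through).
    drctm_n: DRCTM# level (0 = asserted, 1 = deasserted).
    """
    if mwb_wt == 0:
        # Assumption: a write-through line is held in the shared state.
        return "S"
    # Write back mode: DRCTM# asserted skips [E] and goes straight to [M].
    return "M" if drctm_n == 0 else "E"
```

For the cache to cache transfer in this example, `fill_tag_state(1, 0)` gives the direct transition to [M].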

The 82495XP issues a new CADS# in clock 13,
which also misses the 82495XP/82490XP cache.
Since the 82495XP has already sampled CNA#
asserted (clock 4), it issues a new CADS# prior to
receiving CRDY# of the current cycle (i.e. this cycle
is pipelined within the MBC). Note that once the
cycle progress signals (BGT#, CNA#, KWEND#,
SWEND#) of a cycle are sampled asserted, the
82495XP ignores them until the CRDY# of that
cycle. The 82495XP does not pipeline the cycle
progress signals (i.e. it will not sample them again until
after CRDY# of the current memory bus cycle).
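The rule that each cycle progress signal is accepted once per cycle and then ignored until CRDY# can be modeled with a small tracker. This is an illustrative sketch under stated assumptions: the class name `ProgressTracker`, its `sample` method, and the per-clock set of asserted signal names are all hypothetical, and a progress signal arriving in the same clock as CRDY# is treated as belonging to the cycle that is ending.

```python
class ProgressTracker:
    """Accept each cycle progress signal once per memory bus cycle."""

    SIGNALS = ("BGT#", "CNA#", "KWEND#", "SWEND#")

    def __init__(self):
        self.seen = {}  # signal name -> clock in which it was accepted

    def sample(self, clock, asserted):
        """Return the progress signals newly accepted in this clock."""
        accepted = []
        for name in sorted(asserted):
            if name in self.SIGNALS and name not in self.seen:
                self.seen[name] = clock
                accepted.append(name)
        if "CRDY#" in asserted:
            # CRDY# ends the cycle; progress signals are sampled again after it.
            self.seen = {}
        return accepted
```

A repeated BGT# before CRDY# is ignored, while the first BGT# after CRDY# is accepted for the next cycle.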

MEMORY BUS CYCLES: 

At the start of this cycle, the master 82495XP does
not know that the data will be coming from a slave
82495XP/82490XP and begins a read request to
main memory to obtain the required data. Since the
snoop resulted in a hit to a modified line in the
second cache, the memory request must be backed off
so that the snooped 82495XP may supply the data.

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. The memory data 
output enable signal (MDOE#) must remain inactive 
to allow the data pins to be used as inputs. 

For Clocked Memory Bus Mode, MSEL# is driven
active by the MBC (clock 4) to allow sampling of
MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.

MBRDY# is driven active in clocks 4 to 10 to read
data into the 82490XP cache memory cycle buffers.
The MBC asserts MEOC# (clock 11) to end the
read miss cycle on the memory bus and switch the
memory cycle buffers for a new cycle. MZBT# is
latched at this time for the next transfer. Note that
there are 8 transfers needed to fill the
82495XP/82490XP cache line and only 4 needed
for the CPU line fill.

MBRDY# is again driven active by the MBC in 
clocks 11 to 21 to cause the memory burst counter 
to be incremented and data to be placed into the 
82490XP cache memory cycle buffers for the sec- 
ond read miss cycle. 

For Strobed Memory Bus Mode, MSEL# is driven
active by the MBC (clock 4) to allow MISTB
operation and to latch MZBT# for the transfer (on the
falling edge of MSEL#). MISTB is toggled in clocks
5 to 11 to cause the memory burst counter to be
incremented, and data to be placed into the
82490XP cache memory cycle buffers. Note: MISTB
latches the memory bus data on both the rising and
falling edges. The MBC drives MEOC# asserted
(clock 12) to end the current cycle on the memory
bus and switch memory cycle buffers for the new
cycle. MZBT# for the next cycle is latched at this
time on the falling edge of MEOC#.

The MBC toggles MISTB (clocks 16 to 21) for the 
second read miss cycle to increment the memory 
burst counter and cause data to be written into the 
82490XP memory cycle buffers. 

Figure 8-11. Read For Ownership

8.4 Read for Ownership

8.4.1 WRITE MISS WITH MFRZ# ASSERTED, 
FOLLOWED BY READ TO SAME LINE 

Figure 8-11 illustrates a Read For Ownership cycle.
First, a CPU initiates a write cycle which misses the
82495XP/82490XP cache. The MBC issues a “dummy”
write to main memory (the write does not actually
go out to main memory, which saves valuable bus
time). The 82490XP MFRZ# input is used by the
MBC to indicate that the following line-fill (allocation)
data (from either main memory or another cache)
should be merged with the data of the write miss.
The entire line is then placed into the internal
tagram.

CACHE CONTROL SIGNALS: 

The CPU initiates a write cycle to the
82495XP/82490XP cache where the cache tag
state is looked up. Once the 82495XP determines
the cycle to be a cache miss, it issues CADS#
(clock 1) and the associated cycle control signals to
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#, PALLC#) in order to schedule the write
operation. MCACHE# is not active; RDYSRC is not
active, indicating that the 82495XP will supply
BRDY#s to the CPU; PALLC# is active, indicating a
potential allocate cycle after the write through cycle.

The write miss data is posted in the 82490XP’s
memory cycle buffer, and the cycle completes with
no wait states to the CPU. The CPU is free to issue
another (non-related) cycle while the 82495XP is
processing the allocation. If this new cycle is a
cache hit, it will be serviced by the 82495XP
immediately; but if it is a cache miss, its service will wait
until the CRDY# of the allocation.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5 for the write miss and allocation
cycle in this example) and remains valid until after
CNA# is sampled active by the 82495XP (clocks 4
and 10). MALE and MBALE may be used to hold the
address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the write
through cycle is guaranteed to complete on the
memory bus. Once the 82495XP samples BGT#
asserted, it must finish that cycle on the memory bus.
Prior to this point, the cycle can be aborted by a
snoop hit from another cache.

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 


When the MBC has determined the cacheability at- 
tribute of the write through cycle, it drives the 
MKEN# signal accordingly. The MBC also drives 
the KWEND# signal at this time, indicating the end 
of the cacheability window. The 82495XP samples 
MKEN# active during KWEND# (clock 3), indicat- 
ing that the missed line should be allocated in the 
cache. 

The MBC asserts SWEND# (clock 5) when the
snoop window of the write through cycle ends on the
memory bus. Note that the direct to [M] state
qualifier signal (DRCTM#) is sampled during SWEND#
and is inactive for the write. The MBC also issues
CRDY# to the 82495XP at this time so that the
82495XP treats the write cycle as completed on the
memory bus when, in fact, it was not performed.

In this example, the 82495XP requests the allocation 
cycle by issuing CADS# in clock 5. The cycle con- 
trol signals are valid at this point: MCACHE# is ac- 
tive, indicating the cacheability of the line-fill cycle; 
RDYSRC is not active, indicating that the MBC need 
not supply BRDY#s to the CPU (no BRDY#s are 
necessary for an allocation cycle). 

Once again, the MBC arbitrates for the memory bus
and returns BGT# asserted (clock 6) for the
allocation cycle. The MBC asserts CNA#, KWEND#, and
SWEND# (clock 9) to pipeline the memory bus and
close the cacheability and snoop windows. Note that
(for this example) DRCTM# is asserted during
SWEND# to place the line in the modified state.
Since this is done, all other caches must invalidate
their copies.

CRDY# for the allocation (line-fill) cycle is issued by
the MBC in clock 11 to complete the read cycle on
the memory bus and place the data into the
82490XP cache array.

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and
MBALE) may remain asserted by the MBC to place
the address latches in the flow through mode. If the
82495XP is the current bus master, the memory
address output enables (MAOE# and MBAOE#)
should be asserted by the MBC.

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

The MBC asserts MSEL# (clock 2) to allow
sampling of MBRDY# and to latch MZBT# and MFRZ#
for the write. MBRDY# and MEOC# are asserted
by the MBC (clock 3) to place the write data into the
memory cycle buffers, sample MZBT# and MFRZ#
for the next transfer, and end the current cycle on
the memory bus. MFRZ# is driven active by the
MBC here, indicating to the 82495XP that the data
of the write through will be merged with the following
allocation data.

For the allocation (line fill) cycle, MSEL# is driven
active again by the MBC (clock 6) to allow sampling
of MBRDY# and to latch MZBT# for the transfer.
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.
MDOE# is also deasserted in clock 6 to allow the
data pins to be used as inputs for the allocation
cycle.

MBRDY# is driven active by the MBC in clocks 7 to
9 to cause the memory burst counter to be
incremented and data to be placed into the 82490XP
cache memory cycle buffers. During the line fill, the
82490XP will merge the data from the write through
buffer with the incoming data from either main
memory or another cache (if that line was a write hit to
[M] in another cache).
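The merge performed during the allocation can be pictured as a byte-wise selection between the posted write data and the incoming line-fill data. The sketch below is only an illustration of that idea: the function name `merge_line`, the per-byte `written` flags, and the rule that written bytes take precedence over fill bytes are assumptions, not details taken from the 82490XP description.

```python
def merge_line(fill: bytes, posted: bytes, written) -> bytes:
    """Merge posted write-miss bytes into incoming line-fill data.

    fill:    bytes arriving from main memory or another cache.
    posted:  bytes held from the write miss (same length as fill).
    written: one truthy/falsy flag per byte; True where the CPU wrote.
    """
    if not (len(fill) == len(posted) == len(written)):
        raise ValueError("fill, posted and written must have equal length")
    # Assumed precedence: bytes the CPU wrote override the fill data.
    return bytes(p if w else f for f, p, w in zip(fill, posted, written))
```

For example, merging `b"\x00\x11\x22\x33"` with posted data `b"\xaa\xbb\xcc\xdd"` and flags `[False, True, True, False]` keeps the first and last fill bytes and takes the middle two from the write.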

The MBC drives MEOC# asserted (clock 10) to end
the allocation cycle on the memory bus and switch
memory cycle buffers for the new cycle. MZBT# is
sampled at this time for the next data transfer.

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

The MBC asserts MSEL# (clock 2) to allow toggling
of MISTB and to latch MZBT# and MFRZ# for the
write (on MSEL# falling edge). MISTB is toggled
and MEOC# asserted by the MBC (clock 2) to place
the write data into the memory cycle buffers, sample
MZBT# and MFRZ# for the next transfer (on the
falling edge of MEOC# while MSEL# is active), and
end the current cycle on the memory bus. MFRZ# is
driven active by the MBC here, indicating to the
82495XP that the data of the write through will be
merged with the following allocation data.

For the allocation (line fill) cycle, MSEL# is driven
active again by the MBC (clock 7) to allow sampling
of MOSTB and to latch MZBT# for the transfer.
MDOE# is also deasserted in clock 7 to allow the
data pins to be used as inputs for the allocation
cycle.

MOSTB is toggled by the MBC in clocks 8 to 10 to
cause the memory burst counter to be incremented
and data to be placed into the 82490XP cache
memory cycle buffers. During the line fill, the 82490XP
will merge the data from the write through buffer with
the incoming data from either main memory or
another cache (if that line was a write hit to [M] in
another cache).

The MBC drives MEOC# asserted (clock 11) to end
the allocation cycle on the memory bus and switch
memory cycle buffers for the new cycle. MZBT# is
sampled at this time for the next data transfer.


8.5 I/O Cycles 


Figure 8-12 illustrates CPU initiated I/O cycles, both
read and write. I/O writes are the only write cycles
not posted by the 82495XP/82490XP cache (i.e. the
cycle is not fully acknowledged to the CPU until it
has completed on the memory bus).



CACHE CONTROL SIGNALS: 


The CPU initiates an I/O write cycle to the
82495XP/82490XP. The 82495XP then issues
CADS# and CDTS# (clock 1) and the associated
cycle control signals to the MBC (e.g. CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#). MCACHE# is
not active, indicating that the cycle is not cacheable;
RDYSRC is active, indicating that the MBC must
supply BRDY#s to the CPU/Cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 10 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 6 and 17). MALE and
MBALE may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2) for the I/O write cycle,
indicating that the cycle is guaranteed to complete on
the memory bus. Once the 82495XP samples BGT#
asserted, it must finish that cycle on the memory
bus. Prior to this point, the cycle can be aborted by a
snoop hit from another cache.

CNA# for the write cycle is asserted by the MBC 
(clock 5) to indicate that it is ready to schedule a 
new memory bus cycle. Note that SWEND# and 
KWEND# are not needed for I/O cycles since they 
are not cacheable. 


The MBC asserts BRDY# in clock 7 to complete the 
I/O write cycle from the CPU, and CRDY# in clock 8 
to complete the cycle on the memory bus from the 
82495XP/82490XP cache. 

Figure 8-12. I/O Write and Read Cycles

A new CADS# is issued from the 82495XP in clock
10 for an I/O read cycle, along with the associated
cycle control signals. MCACHE# is again not active,
and RDYSRC is again active.

The MBC returns BGT# asserted right away (clock
11). The 82495XP can pipeline I/O cycles, but does
not for the I/O read in this example.

Upon completing the access on the memory bus, 
the MBC activates BRDY# (clock 17) and CRDY# 
(clock 16). Note that BRDY# of a cycle may come 
before (as in the I/O write cycle of this example), 
with or after the CRDY# of the same cycle. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data
output enable signal (MDOE#) is asserted by the
MBC in clock 3 to drive the memory data outputs.

MEOC# is asserted by the MBC (clock 5) to latch
MZBT# for the I/O write transfer, and end that cycle
on the memory bus (MBRDY# is not necessary
since this example shows a single transfer cycle).
MZBT# is driven high by the MBC in order to force
the write cycle to begin with the correct burst
address. MFRZ# is also sampled here (it need not be
active since the cycle is not potentially allocatable).

For the I/O read cycle, MDOE# is deasserted (clock
12) by the MBC to allow the data pins to be used as
inputs.

MSEL# is driven active by the MBC (clock 12) to
allow sampling of MBRDY# and to latch MZBT# for
the transfer. MZBT# is sampled on all MCLK edges
where MSEL# is inactive. Once MSEL# is sampled
active by the 82495XP, the value of MZBT#
sampled on the prior MCLK is used for the next transfer.
Again, MZBT# is driven high by the MBC to force
the transfer to begin with the correct burst address.

The MBC asserts MBRDY# (clock 14) to cause the
memory burst counter to be incremented and data to
be placed into the 82490XP cache memory cycle
buffers. The MBC drives MEOC# asserted (clock
15) to end the read cycle on the memory bus and
switch memory cycle buffers for a new cycle.
MZBT# for the next transfer is latched at this time.


For Strobed Memory Bus Mode, the memory data
output enable signal (MDOE#) has been asserted
by the MBC to drive the memory data outputs.

MEOC# is asserted by the MBC (clock 5) to latch
MZBT# for the I/O write transfer (on MEOC# falling
edge), and end that cycle on the memory bus
(MOSTB is not necessary since this example shows
a single transfer cycle). MZBT# is driven high by the
MBC in order to force the write cycle to begin with
the correct burst address. MFRZ# is also sampled
here (it need not be active since the cycle is not
potentially allocatable).

For the I/O read cycle, MDOE# is deasserted (clock 
10) by the MBC to allow the data pins to be used as 
inputs. 

MSEL# is driven active by the MBC (clock 10) to
allow operation of MISTB and to latch MZBT# for
the transfer (on MSEL# falling edge). Again,
MZBT# is driven high by the MBC to force the
transfer to begin with the correct burst address.

The MBC toggles MISTB (clock 15) to cause the
memory burst counter to be incremented and data to
be placed into the 82490XP cache memory cycle
buffers for the I/O read cycle. Note: MISTB latches
the memory bus data on both the rising and falling
edges. The MBC drives MEOC# asserted (clock 16)
to end the read cycle on the memory bus and switch
memory cycle buffers for a new cycle. MZBT# for
the next transfer is latched at this time (on the falling
edge of MEOC#).


8.6 LOCKed Cycles 

8.6.1 CPU READ MODIFY WRITE CYCLES 

The 82495XP provides a facility to allow atomic
accesses requested by the CPU (via CPU LOCK#
activation) through the 82495XP KLOCK# signal.
Figure 8-13 illustrates two back-to-back CPU initiated
Locked read-modify-write cycles. KLOCK# activation
indicates to the MBC that the memory bus
should not be released between the KLOCKed
cycles. KLOCK# will remain asserted from the
beginning of the first cycle (with CADS#) until one clock
after the CADS# of the last cycle. The 82495XP does
not distinguish between back-to-back locked
operations and will not open an arbitration window
(deassert KLOCK#) between them. It is the responsibility
of the MBC to distinguish between the multiple RMW
sequences, if it is so desired.
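The extent of the KLOCK# window follows directly from the CADS# clocks of the locked cycles. The sketch below computes it; the function name `klock_window` is hypothetical, and whether the pin is still driven active during the final clock of the window is not settled by the text, so only the two boundary clocks are returned.

```python
def klock_window(cads_clocks):
    """Return (start, end) clocks of the KLOCK# window for locked cycles.

    start is the clock of the first CADS#; end is one clock after the
    CADS# of the last locked cycle. Back-to-back locked sequences are
    not separated, so all of their CADS# clocks share one window.
    """
    if not cads_clocks:
        raise ValueError("at least one CADS# clock is required")
    return min(cads_clocks), max(cads_clocks) + 1
```

With CADS# issued in clocks 1, 5, 7 and 11, as in the two back-to-back sequences of this example, the window runs from clock 1 to clock 12 with no arbitration gap in between.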

Figure 8-13. LOCKed Read-Modify-Write Cycles

The 82495XP issues a request for a memory bus
access (CADS#) for every locked cycle (read or
write) regardless of whether it hits the cache or not.
Locked read cycles are treated by the 82495XP as
cache misses, and, if the line is in the [M] state, the
82495XP ignores the data on the memory bus and
uses the data in the 82490XP array. Locked write
cycles are treated as write through, and the tag state
does not change even if the line is in the 82490XP
array.
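The handling of locked cycles described above can be condensed into a small lookup. This is a hedged sketch rather than a statement of the controller's internal logic: the function name `locked_cycle`, the dictionary keys, and the state letters are hypothetical, and taking data from the memory bus for lines not in [M] is an inference from the text.

```python
def locked_cycle(kind: str, tag_state: str) -> dict:
    """Summarize how a locked cycle is handled for a line in tag_state.

    kind is "read" or "write"; tag_state is one of "M", "E", "S", "I".
    """
    if kind == "read":
        # Always treated as a miss; a modified line supplies its own data.
        source = "82490XP array" if tag_state == "M" else "memory bus"
        return {"memory_bus_cycle": True, "data_source": source}
    if kind == "write":
        # Treated as write through; the tag state is left unchanged.
        return {"memory_bus_cycle": True, "next_state": tag_state}
    raise ValueError("kind must be 'read' or 'write'")
```

Every locked access therefore produces a CADS#, even when the line is present in the cache.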

CACHE CONTROL SIGNALS:

The CPU initiates a Locked read cycle to the
82495XP/82490XP cache where, due to the
assertion of CPU LOCK#, it assumes a cache miss and
issues CADS# to the MBC (clock 1) along with the
associated cycle control signals (e.g. CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#). MCACHE# is
never asserted for LOCKed cycles; RDYSRC is
active, indicating that the MBC must supply BRDY# to
the CPU/Cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5, then 7 and 11 for the two locked
RMW sequences in this example) and remains valid
until after CNA# is sampled active by the 82495XP
(clocks 3 and 7, then 9 and 13). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the cycle is
guaranteed to complete on the memory bus. Once
the 82495XP samples BGT# asserted, it must finish
that cycle on the memory bus. Prior to this point, the
cycle can be aborted by a snoop hit from another
cache.

CNA# for the read cycle is also asserted by the
MBC (clock 2) to indicate that it may schedule a new
memory bus cycle. Note that the cycle control
signals are not guaranteed to be valid after CNA#
activation.

The MBC asserts BRDY# to the CPU/Cache core 
in clock 4. CRDY# for the locked read cycle is as- 
serted to the 82495XP/82490XP from the MBC 
(clock 5) to load the data stored in the 82490XP’s 
memory cycle buffers into the cache array. If the
read was to a dirty line, the 82495XP ignores the
data in the memory cycle buffers and uses the data
in the cache array.


The 82495XP issues a new memory cycle request
(CADS# in clock 5) for the Locked write as soon as 
it completes the Locked read cycle. The cycle con- 
trol signals are also valid at this time. RDYSRC is not 
active, indicating that the 82495XP will supply 
BRDY# to the CPU. 


The locked write cycle is posted like any other mem- 
ory write cycle. 


In this example, the CPU initiates a second
read-modify-write cycle immediately. KLOCK# is not
deasserted between the back-to-back locked sequences
since the CPU LOCK# remains asserted. If
snooping is required between these cycles, it is the
MBC's responsibility to predict this boundary and
allow snooping. The 82495XP issues a memory bus
request (CADS#) in clock 7 for the second locked 
read cycle, along with the new cycle control signals. 



The second locked RMW sequence repeats the actions
of the first. Its purpose in this example is to
demonstrate that an arbitration window may not
open between locked sequences if they follow one
another with no idle or non-locked cycles between 
them. 


MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow-through mode. If the
82495XP is the current bus master, the memory address
output enables (MAOE# and MBAOE#)
should be asserted by the MBC. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 3) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where
MSEL# is inactive. Once MSEL# is sampled active
by the 82495XP, the value of MZBT# sampled on
the prior MCLK is used for the next transfer.

The memory data output enable signal (MDOE#) 
must be inactive to allow the data pins to be used as 
inputs for the first locked read cycle. The MBC asserts
MEOC# (clock 4) to latch MZBT# for the next
transfer, and end the current locked read cycle on
the memory bus (MBRDY# is not necessary since
this example shows a single transfer cycle). MZBT#
is driven high by the MBC in order to force the read 
cycle to begin with the correct burst address. 


Locked sequences always end in a write cycle, and no
new CPU initiated cycles may be inserted between
the Locked read and Locked write cycles.

For the locked write cycle, MDOE# is asserted by
the MBC (clock 5) to drive the memory data outputs.
MEOC# is again asserted (clock 6) to latch MZBT#
for the next transfer, and end the current locked 
write cycle on the memory bus (MBRDY# is not 
necessary since this is a single transfer cycle). 
MZBT# is again driven high. MFRZ# is also sam- 
pled during write cycles when MEOC# is sampled 
active by the 82495XP. 

MDOE# is deasserted by the MBC (clock 7) to allow 
the data pins to be used as inputs for the second 
locked read cycle. MEOC# is again asserted (clock 
8) to latch MZBT# for the next transfer, and end the
locked read cycle on the memory bus. MZBT# is 
again driven high. 

MDOE# is asserted by the MBC (clock 9) to drive 
the memory data outputs for the second locked write 
cycle. MBRDY# is asserted (clock 13) to cause the 
memory burst counter to be incremented and data to
be placed into the 82490XP cache memory cycle 
buffers. The MBC drives MEOC# active and 
MSEL# inactive (clock 14) to end the locked write 
cycle on the memory bus and switch memory cycle 
buffers for a new cycle. MZBT# and MFRZ# for the
next transfer are sampled at this time. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 1) to allow sampling of 
MxSTB and to latch MZBT# for the first locked read
transfer (on the falling edge of MSEL#). 

The memory data output enable signal (MDOE#) 
must be inactive to allow the data pins to be used as 
inputs for the first locked read cycle. The MBC asserts
MEOC# (clock 3) to latch MZBT# for the next
transfer (on MEOC# falling edge while MSEL# is 
active), and end the current locked read cycle on the 
memory bus (MISTB is not necessary since this ex- 
ample shows a single transfer cycle). MZBT# is 
driven high by the MBC in order to force the read 
cycle to begin with the correct burst address. 

For the locked write cycle, MDOE# is asserted by 
the MBC (clock 4) to drive the memory data outputs. 
MEOC# is again asserted (clock 6) to latch MZBT#
for the next transfer, and end the current locked
write cycle on the memory bus (MOSTB is not necessary
since this is a single transfer cycle). MZBT#
is again driven high. MFRZ# is also sampled on the
falling edge of MEOC#. 

MDOE# is deasserted by the MBC (clock 7) to allow
the data pins to be used as inputs for the second
locked read cycle. MEOC# is again asserted (clock
8) to latch MZBT# for the next transfer, and end the
locked read cycle on the memory bus. MZBT# is 
again driven high. 


MDOE# is asserted by the MBC (clock 9) to drive 
the memory data outputs for the second locked write 
cycle. MOSTB is toggled (clock 12) to cause the 
memory burst counter to be incremented and data to
be placed into the 82490XP cache memory cycle 
buffers. The MBC drives MEOC# active and 
MSEL# inactive (clock 13) to end the locked write 
cycle on the memory bus and switch memory cycle 
buffers for a new cycle. MZBT# and MFRZ# for the
next transfer are sampled at this time. 


9.0 TESTABILITY 

Testing the 82495XP/82490XP chipset can be divid- 
ed into three categories: Built-In Self Test (BIST), 
Boundary Scan, and external testing. BIST performs 
basic device testing on the 82495XP. Boundary 
Scan provides additional test hooks that conform to 
the IEEE Standard Test Access Port and Boundary 
Scan Architecture (IEEE Std. 1149.1). Additional 
testing can be performed by using software written 
to test the 82490XP cache SRAM. 


9.1 Built-In Self Test (BIST) 

BIST tests the internal functionality of the 82495XP.
The 82495XP’s BIST tests approximately 90% of 
the cache controller. It tests the tag RAM and com- 
parators. 

The 82495XP BIST is initiated by driving 
SLFTST#(CRDY#) low and HIGHZ#(MBALE) high 
at least 10 clocks before RESET goes inactive. The 
82495XP Cache Controller reports the result of BIST 
on the CAHOLD signal. When the self test completes,
the 82495XP drives FSIOUT# inactive and
the BIST result on CAHOLD. If CAHOLD is driven
active, BIST passed. If CAHOLD is
driven inactive, BIST detected a flaw in the cache
controller. CAHOLD is valid for one clock after 
FSIOUT# deactivation and should be sampled on 
the rising edge of FSIOUT#. 
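
As a rough sketch of the sampling rule above, the following Python function scans a per-clock capture of FSIOUT# and CAHOLD and reports the BIST result at the rising edge of FSIOUT#. The function name and the trace format are hypothetical; the sketch also assumes that FSIOUT# inactive and CAHOLD active both correspond to logic 1.

```python
def bist_passed(samples):
    """Return True (pass), False (fail), or None (no result seen).

    `samples` is an iterable of (fsiout_n, cahold) logic levels, one pair
    per clock. The result is taken from CAHOLD on the clock where FSIOUT#
    is first seen going from 0 (active) to 1 (inactive).
    """
    previous = None
    for fsiout_n, cahold in samples:
        if previous == 0 and fsiout_n == 1:
            # Rising edge of FSIOUT#: CAHOLD carries the BIST result here.
            return cahold == 1
        previous = fsiout_n
    return None


# Hypothetical capture: FSIOUT# is low during BIST, then rises with CAHOLD high.
trace = [(0, 0), (0, 0), (1, 1), (1, 0)]
result = bist_passed(trace)
```

Returning None when no rising edge is observed keeps a capture that started too late from being mistaken for a failure.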

On the 82495XP, BIST only informs the system that 
a failure did or did not occur. BIST is not able to 
indicate where a failure occurred. After completing
BIST, the cache controller performs a reset and begins
normal operation.


9.2 Boundary Scan 

The 82495XP/82490XP chipset provides additional
testability features compatible with the IEEE Standard
Test Access Port and Boundary Scan Architecture
(IEEE Std. 1149.1). The test logic provided allows
for testing to ensure that components function
correctly, that interconnections between various 
components are correct, and that various compo- 
nents interact correctly on the printed circuit board. 

The boundary scan test logic consists of a boundary 
scan register and support logic that are accessed 
through a test access port (TAP). The TAP provides 
a simple serial interface that makes it possible to 
test all signal traces with only a few probes. 

The TAP can be controlled via a bus master. The 
bus master can be either automatic test equipment 
or a component (PLD) that interfaces to the four-pin 
test bus. 


9.2.1 BOUNDARY SCAN ARCHITECTURE 

The boundary scan test logic contains the following 
elements:

— Test access port (TAP), consisting of input pins 
TMS, TCK, and TDI; and output pin TDO.

— TAP controller, which interprets the inputs on the 
test mode select (TMS) line and performs the 
corresponding operation. The operations per- 
formed by the TAP include controlling the in- 
struction and data registers within the compo- 
nent. 

— Instruction register (IR), which accepts instruc- 
tion codes shifted into the test logic on the test 
data input (TDI) pin. The instruction codes are 
used to select the specific test operation to be 
performed or the test data register to be ac- 
cessed. 

— Test data registers: The 82495XP/82490XP 
chipset components each contain three test data 
registers: Bypass register (BPR), Device Identifi- 
cation register (DID), and Boundary Scan regis- 
ter (BSR). 

The instruction and test data registers are separate 
shift-register paths connected in parallel and have a 
common serial data input and a common serial data
output connected to the TAP signals, TDI and TDO, 
respectively. 

9.2.2 DATA REGISTERS 

The 82495XP and 82490XP both contain the two 
required test data registers; bypass register and 
boundary scan register. In addition, they also have a 
device identification register. 


Each test data register is serially connected to TDI 
and TDO, with TDI connected to the most significant 
bit and TDO connected to the least significant bit of 
the test data register. Data is shifted one stage (bit 
position within the register) on each rising edge of 
the test clock (TCK). 
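
As a rough illustration of the shift direction described above, the following Python sketch models a test data register with TDI feeding the most significant stage and TDO taken from the least significant stage. The class `ShiftRegister` and the helper `scan` are hypothetical names used only for this example.

```python
class ShiftRegister:
    """Toy model of a serially connected test data register."""

    def __init__(self, width, value=0):
        self.width = width
        # bits[0] is the most significant stage (next to TDI);
        # bits[-1] is the least significant stage (next to TDO).
        self.bits = [(value >> (width - 1 - i)) & 1 for i in range(width)]

    def tck_rising(self, tdi):
        """Shift one stage toward TDO; return the bit that appears on TDO."""
        tdo = self.bits[-1]
        self.bits = [tdi & 1] + self.bits[:-1]
        return tdo

    def value(self):
        """Return the register contents as an integer."""
        result = 0
        for bit in self.bits:
            result = (result << 1) | bit
        return result


def scan(register, tdi_bits):
    """Apply one rising TCK edge per TDI bit and collect the TDO bits."""
    return [register.tck_rising(bit) for bit in tdi_bits]


# A 4-bit register holding 1010 is shifted out least significant bit first.
reg = ShiftRegister(4, 0b1010)
tdo_bits = scan(reg, [1, 1, 0, 0])
```

Because data enters at the most significant stage, the first bit presented on TDI ends up in the least significant position once the register has been fully shifted. A one-bit instance behaves like the bypass register, delaying TDI by one TCK.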


9.2.2.1 Bypass Register 

The Bypass Register is a one-bit shift register that 
provides the minimal length path between TDI and 
TDO. This path can be selected when no test opera- 
tion is being performed by the component to allow 
rapid movement of test data to and from other com- 
ponents on the board. While the bypass register is 
selected data is transferred from TDI to TDO without 
inversion. 


9.2.2.2 Boundary Scan Register 

The Boundary Scan Register is a single shift register 
path containing the boundary scan cells that are 
connected to all input and output pins of the 
82495XP/82490XP chipset. Figure 9-1 shows the
logical structure of the boundary scan register. While 
output cells determine the value of the signal driven 
on the corresponding pin, input cells only capture 
data; they do not affect the normal operation of the 
device. Data is transferred without inversion from 
TDI to TDO through the boundary scan register dur- 
ing scanning. The boundary scan register can be op- 
erated by the EXTEST and SAMPLE instructions. 
The boundary scan register order is described in 
section 9.2.5. 


9.2.2.3 Device Identification Register 

The Device Identification Register contains the man- 
ufacturer’s identification code, part number code, 
and version code in the format shown in Figure 9-2.
Table 9-1 lists the codes corresponding to the
82495XP and 82490XP. 
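
The following Python sketch shows how the three codes could be combined into, and recovered from, a 32-bit identification word. It assumes the register follows the standard IEEE Std. 1149.1 device identification layout (4-bit version in bits 31:28, 16-bit part number in bits 27:12, 11-bit manufacturer identity in bits 11:1, and bit 0 fixed at 1); that layout, the type `DeviceId`, and both function names are assumptions of this example rather than statements from this data sheet. The sample values are the ones listed for the 82495XP (B0) in Table 9-1.

```python
from typing import NamedTuple


class DeviceId(NamedTuple):
    version: int        # 4-bit version code
    part_number: int    # 16-bit part number code
    manufacturer: int   # 11-bit manufacturer identity


def pack_idcode(dev):
    """Build the 32-bit word (assumed IEEE Std. 1149.1 field layout)."""
    return (
        ((dev.version & 0xF) << 28)
        | ((dev.part_number & 0xFFFF) << 12)
        | ((dev.manufacturer & 0x7FF) << 1)
        | 1  # bit 0 is always 1 in an 1149.1 identification code
    )


def unpack_idcode(word):
    """Split a 32-bit identification word back into its three fields."""
    if word & 1 != 1:
        raise ValueError("bit 0 of an identification code must be 1")
    return DeviceId(
        version=(word >> 28) & 0xF,
        part_number=(word >> 12) & 0xFFFF,
        manufacturer=(word >> 1) & 0x7FF,
    )


# Values listed for the 82495XP (B0): version 0Bh, part 0495h, manufacturer 09h.
b0 = DeviceId(version=0xB, part_number=0x0495, manufacturer=0x09)
word = pack_idcode(b0)
```

A board test can compare the unpacked fields against the table entries to confirm that the expected component and stepping are fitted.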


Table 9-1. Device ID Register Values

Component             Version Code   Part Number Code   Manufacturer Identity
82495XP (A0 or A1)    0Ah            0495h              00h
82495XP (B0)          0Bh            0495h              09h
82490XP (A0 or A1)    00h            49A0h              00h
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Figure 9-1. Boundary Scan Register Structure 
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Figure 9-2. Device ID Register 


9.2.2.4 Runbist Register 

The Runbist Register is a one bit register used to
report the results of the 82495XP BIST when it is
initiated by the RUNBIST instruction. This register is
loaded with a “1” prior to invoking the BIST and is
loaded with “1” upon successful completion. “0”
indicates a failure occurred during BIST.

NOTE: 

82495XP RUNBIST is not available in the A-stepping.


9.2.3 INSTRUCTION REGISTER 

The Instruction Register (IR) allows instructions to 
be serially shifted into the device. The instruction 
selects the particular test to be performed, the test 
data register to be accessed, or both. The instruc- 
tion register is four (4) bits wide. The most significant 
bit is connected to TDI and the least significant bit is
connected to TDO. There are no parity bits associated
with the instruction register. Upon entering the
Capture-IR TAP controller state, the instruction register
is loaded with the default instruction “0001”,
SAMPLE/PRELOAD. Instructions are shifted into
the instruction register on the rising edge of TCK
while the TAP controller is in the Shift-IR state.


82495XP Cache Controller/82490XP Cache RAM 






9.2.3.1 82495XP Boundary Scan Instruction Set

The 82495XP cache controller supports all three 
mandatory boundary scan instructions (BYPASS, 
SAMPLE/PRELOAD, and EXTEST) along with one 
optional instruction (IDCODE). On the B-stepping of
the 82495XP two additional optional instructions will
be implemented (RUNBIST and TRISTATE). Table
9-2 lists the 82495XP boundary scan instruction
codes. The instructions listed as PRIVATE cause
TDO to become enabled in the Shift-DR state and
cause “0” to be shifted out of TDO on the rising
edge of TCK. Execution of the PRIVATE instructions
will not cause hazardous operation of the 82495XP.
Note that system tests should not execute instruction
codes labeled “RESERVED”. These instructions
can put the component in an indeterminate
state which can only be cleared by power on reset.


Table 9-2. 82495XP Boundary Scan Instruction Codes

Instruction Code   Instruction Name
0000               EXTEST
0001               SAMPLE
0010               IDCODE
0011               RESERVED
0100               RESERVED
0101               RESERVED
0110               RESERVED
0111               *RUNBIST
1000               *TRISTATE
1001               RESERVED
1010               PRIVATE
1011               PRIVATE
1100               PRIVATE
1101               PRIVATE
1110               PRIVATE
1111               BYPASS


* RUNBIST and TRISTATE are boundary scan instructions 
that will be implemented in the B-stepping of the 82495XP. 
They are not available on the A-stepping. 
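
For test software, the instruction codes in Table 9-2 can be held in a small lookup table. The Python sketch below is one possible arrangement, not an Intel-supplied interface: the names `decode`, `usable_in_system_test`, and `ir_shift_bits` are hypothetical, and treating RUNBIST and TRISTATE as unusable on the A-stepping is a conservative assumption based on the footnote above.

```python
# Codes taken from Table 9-2 (82495XP).
NAMED = {
    0b0000: "EXTEST",
    0b0001: "SAMPLE/PRELOAD",
    0b0010: "IDCODE",
    0b0111: "RUNBIST",
    0b1000: "TRISTATE",
    0b1111: "BYPASS",
}
PRIVATE = {0b1010, 0b1011, 0b1100, 0b1101, 0b1110}
B_STEPPING_ONLY = {0b0111, 0b1000}


def decode(code):
    """Map a 4-bit instruction code to its name in Table 9-2."""
    if not 0 <= code <= 0b1111:
        raise ValueError("instruction register is four bits wide")
    if code in NAMED:
        return NAMED[code]
    return "PRIVATE" if code in PRIVATE else "RESERVED"


def usable_in_system_test(code, stepping):
    """RESERVED codes must never be executed; B-stepping-only
    instructions are rejected on the A-stepping (assumption)."""
    if decode(code) == "RESERVED":
        return False
    if code in B_STEPPING_ONLY and stepping.upper().startswith("A"):
        return False
    return True


def ir_shift_bits(code):
    """Bits to present on TDI, first bit first. TDI feeds the most
    significant stage, so the least significant bit is shifted in first."""
    return [(code >> i) & 1 for i in range(4)]


idcode_bits = ir_shift_bits(0b0010)
```

Rejecting RESERVED codes in software guards against putting the component into a state that only a power on reset can clear.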


EXTEST The instruction code is “0000”. The EXTEST
instruction allows testing of circuitry
external to the component package,
typically board interconnects. It
does so by driving the values loaded
into the 82495XP boundary scan register
out on the output pins corresponding
to each boundary scan cell and capturing
the values on 82495XP input pins
to be loaded into their corresponding
boundary scan register locations. I/O
pins are selected as input or output, depending
on the value loaded into their
control setting locations in the boundary
scan register. Values shifted into input
latches in the boundary scan register
are never used by the internal logic of
the 82495XP. Note: after using the EXTEST
instruction, the 82495XP must be
reset before normal (non-boundary
scan) use.


SAMPLE/PRELOAD The instruction code is “0001”. The
SAMPLE/PRELOAD instruction has two functions.
When the TAP controller
is in the Capture-DR state, the SAMPLE/PRELOAD
instruction allows a
“snap-shot” of the normal operation of
the component without interfering with
that normal operation. The instruction
causes boundary scan register cells associated
with outputs to sample the value
being driven by the 82495XP. It causes
the cells associated with inputs to
sample the value being driven into the
82495XP. On both outputs and inputs
the sampling occurs on the rising edge
of TCK. When the TAP controller is in
the Update-DR state, the SAMPLE/PRELOAD
instruction preloads data to
the device pins to be driven to the board
by executing the EXTEST instruction.
Data is preloaded to the pins from the
boundary scan register on the falling
edge of TCK.



IDCODE The instruction code is “0010”. The IDCODE
instruction selects the device
identification register to be connected to
TDI and TDO, allowing the device's identification
code to be shifted out of the
device on TDO. Note that the device
identification register is not altered by
data being shifted in on TDI.


BYPASS The instruction code is “1111”. The BYPASS
instruction selects the bypass
register to be connected to TDI and
TDO, effectively bypassing the test logic
on the 82495XP by reducing the shift
length of the device to one bit. Note that
an open circuit fault in the board level
test data path will cause the bypass register
to be selected following an instruction
scan cycle due to the pull-up resistor
on the TDI input. This has been done
to prevent any unwanted interference
with the proper operation of the system
logic.
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RUNBIST The instruction code is “0111”. The 
RUNBIST instruction selects the one (1)
bit runbist register, loads a value of “0”
into the runbist register, and connects it
to TDO. It also initiates the built-in self
test (BIST) feature of the 82495XP,
which is able to detect approximately
90% of the stuck-at faults on the
82495XP. The 82495XP ac/dc specifications
for VCC and CLK must be met
and reset must have been asserted at
least once prior to executing the
RUNBIST boundary scan instruction. After
loading the RUNBIST instruction
code in the instruction register, the TAP
controller must be placed in the
Run-Test/Idle state. BIST begins on the first
rising edge of TCK after entering the
Run-Test/Idle state. The TAP controller
must remain in the Run-Test/Idle state
until BIST is completed. It requires 100K
clock (CLK) cycles to complete BIST
and report the result to the runbist register.
After completing the 100K clock
(CLK) cycles, the value in the runbist
register should be shifted out on TDO
during the Shift-DR state. A value of “1”
being shifted out on TDO indicates BIST
successfully completed. A value of “0”
indicates a failure occurred. After executing
the RUNBIST instruction, the
82495XP must be reset prior to normal
operation. NOTE: This instruction is not
available on the A-stepping of the
82495XP. It will be implemented in the
B-stepping.

TRISTATE The instruction code is “1000”. The
TRISTATE instruction initiates the tristate
output test mode. After loading the
TRISTATE boundary scan instruction
into the instruction register, the TAP
controller must be placed in the
Run-Test/Idle state. To terminate the tristate
output test mode, the 82495XP must be
reset. NOTE: This instruction is not
available on the A-stepping of the
82495XP. It will be implemented in the
B-stepping.


9.2.3.2 82490XP Boundary Scan Instruction Set 

The 82490XP cache RAM supports all three
mandatory boundary scan instructions (BYPASS,
SAMPLE/PRELOAD, and EXTEST) along with one
optional instruction (IDCODE). Table 9-3 lists the
82490XP boundary scan instruction codes. The
instructions listed as PRIVATE cause TDO to become
enabled in the Shift-DR state and cause “0” to be
shifted out of TDO on the rising edge of TCK. Execution
of the PRIVATE instructions will not cause hazardous
operation of the 82490XP. Note that system
tests should not execute instruction codes labeled
“INTEL RESERVED”. These instructions can put
the component in an indeterminate state which can
only be cleared by power on reset.


Table 9-3. 82490XP Boundary Scan Instruction Codes

Instruction Code   Instruction Name
0000               EXTEST
0001               SAMPLE
0010               IDCODE
0011               INTEL RESERVED
0100               INTEL RESERVED
0101               INTEL RESERVED
0110               INTEL RESERVED
0111               INTEL RESERVED
1000               INTEL RESERVED
1001               INTEL RESERVED
1010               PRIVATE
1011               PRIVATE
1100               PRIVATE
1101               PRIVATE
1110               PRIVATE
1111               BYPASS


EXTEST The instruction code is “0000”. The EXTEST
instruction allows testing of circuitry
external to the component package,
typically board interconnects. It
does so by driving the values loaded
into the 82490XP boundary scan register
out on the output pins corresponding
to each boundary scan cell and capturing
the values on 82490XP input pins to
be loaded into their corresponding
boundary scan register locations. I/O
pins are selected as input or output, depending
on the value loaded into their
control setting locations in the boundary
scan register. Values shifted into input
latches in the boundary scan register
are never used by the internal logic of
the 82490XP. Note: after using the EXTEST
instruction, the 82490XP must be
reset before normal (non-boundary
scan) use.
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SAMPLE/ The instruction code is “0001”. The 

PRELOAD SAMPLE/PRELOAD has two functions 
that it performs. When the TAP control- 
ler is in the Capture-DR state, the SAM- 
PLE/PRELOAD instruction allows a 
“snap-shot” of the normal operation of 
the component without interfering with 
that normal operation. The instruction 
causes boundary scan register cells as- 
sociated with outputs to sample the val- 
ue being driven by the 82490XP. It caus- 
es the cells associated with inputs to 
sample the value being driven into the 
82490XP. On both outputs and inputs 
the sampling occurs on the rising edge 
of TCK. When the TAP controller is in 
the Update-DR state, the SAMPLE/ 
PRELOAD instruction preloads data to
the device pins to be driven to the board 
by executing the EXTEST instruction. 
Data is preloaded to the pins from the 
boundary scan register on the falling 
edge of TCK. 

IDCODE The instruction code is “0010”. The ID- 
CODE instruction selects the device 
identification register to be connected to 
TDI and TDO, allowing the device's identification
code to be shifted out of the
device on TDO. Note that the device 
identification register is not altered by 
data being shifted in on TDI. 

BYPASS The instruction code is “1111”. The BYPASS
instruction selects the bypass
register to be connected to TDI and 
TDO, effectively bypassing the test logic 
on the 82490XP by reducing the shift 
length of the device to one bit. Note that
an open circuit fault in the board level
test data path will cause the bypass register
to be selected following an instruction
scan cycle due to the pull-up resistor
on the TDI input. This has been done
to prevent any unwanted interference 
with the proper operation of the system 
logic. 


9.2.4 TEST ACCESS PORT (TAP) CONTROLLER

The TAP controller is a synchronous, finite state ma- 
chine. It controls the sequence of operations of the 
test logic. The TAP controller changes state only in 
response to the following events: 

1. A rising edge of TCK 

2. Power-up. 


The value of the test mode select (TMS) input signal
at a rising edge of TCK controls the sequence of the
state changes. The state diagram for the TAP controller
is shown in Figure 9-3. Test designers must
consider the operation of the state machine in order 
to design the correct sequence of values to drive on 
TMS. 
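
One way to work out a TMS sequence is to model the state machine in software. The Python sketch below encodes the sixteen controller states as a transition table and searches for the shortest TMS sequence between two states. The table follows the state descriptions in the subsections of 9.2.4; the exits from Update-DR and Update-IR are not spelled out in the text and are taken from IEEE Std. 1149.1. The names `TAP`, `run`, and `tms_path` are hypothetical.

```python
from collections import deque

# state: (next state when TMS = 0, next state when TMS = 1)
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def run(state, tms_bits):
    """Apply one rising TCK edge per TMS value and return the final state."""
    for tms in tms_bits:
        state = TAP[state][tms]
    return state


def tms_path(start, goal):
    """Return the shortest list of TMS values leading from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for tms in (0, 1):
            nxt = TAP[state][tms]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [tms]))
    raise ValueError("goal state is not reachable")


# TMS values needed to reach Shift-IR after a reset.
to_shift_ir = tms_path("Test-Logic-Reset", "Shift-IR")
```

A breadth-first search is used so that the returned sequence needs the fewest TCK cycles; the same table can also confirm that holding TMS high for five rising edges of TCK returns the controller to Test-Logic-Reset from any state.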


9.2.4.1 Test-Logic-Reset State 


In this state, the test logic is disabled so that normal 
operation of the device can continue unhindered. 
This is achieved by initializing the instruction register
such that the IDCODE instruction is loaded. No matter
what the original state of the controller, the controller
enters Test-Logic-Reset state when the TMS
input is held high (1) for at least five rising edges of
TCK. The controller remains in this state while TMS
is high. The TAP controller is also forced to enter 
this state at power-up. 



9.2.4.2 Run-Test/Idle State 

This is the controller state between scan operations. Once in
this state, the controller remains in this state as
long as TMS is held low. In devices supporting the
RUNBIST instruction, the BIST is performed during
this state and the result is reported in the runbist
register. For instructions not causing functions to execute
during this state, no activity occurs in the test
logic. The instruction register and all test data registers
retain their previous state. When TMS is high
and a rising edge is applied to TCK, the controller
moves to the Select-DR-Scan state.


9.2.4.3 Select-DR-Scan State 

This is a temporary controller state. The test data
register selected by the current instruction retains its
previous state. If TMS is held low and a rising edge
is applied to TCK when in this state, the controller
moves into the Capture-DR state, and a scan sequence
for the selected test data register is initiated.
If TMS is held high and a rising edge is applied to
TCK, the controller moves to the Select-IR-Scan
state.

The instruction does not change in this state. 
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Figure 9-3. Tap Controller State Diagram 


9.2.4.4 Capture-DR State

In this state, the boundary scan register captures
input pin data if the current instruction is EXTEST or
SAMPLE/PRELOAD. The other test data registers,
which do not have parallel input, are not changed.

The instruction does not change in this state.

When the TAP controller is in this state and a rising
edge is applied to TCK, the controller enters the
Exit1-DR state if TMS is high or the Shift-DR state if
TMS is low.

9.2.4.5 Shift-DR State

In this controller state, the test data register connected
between TDI and TDO as a result of the current
instruction shifts data one stage toward its serial
output on each rising edge of TCK.

The instruction does not change in this state.

When the TAP controller is in this state and a rising
edge is applied to TCK, the controller enters the
Exit1-DR state if TMS is high or remains in the Shift-DR
state if TMS is low.
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9.2.4.6 Exit1-DR State 

This is a temporary state. While in this state, if TMS
is held high, a rising edge applied to TCK causes the
controller to enter the Update-DR state, which terminates
the scanning process. If TMS is held low and a
rising edge is applied to TCK, the controller enters
the Pause-DR state.

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state.

9.2.4.7 Pause-DR State 

The pause state allows the test controller to temporarily
halt the shifting of data through the test data
register in the serial path between TDI and TDO. An
example of using this state could be to allow a tester
to reload its pin memory from disk during application
of a long test sequence.

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state.

The controller remains in this state as long as TMS
is low. When TMS goes high and a rising edge is
applied to TCK, the controller moves to the Exit2-DR
state.


9.2.4.8 Exit2-DR State 

This is a temporary state. While in this state, if TMS
is held high, a rising edge applied to TCK causes the
controller to enter the Update-DR state, which terminates
the scanning process. If TMS is held low and a
rising edge is applied to TCK, the controller enters
the Shift-DR state.

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state.

9.2.4.9 Update-DR State 

The boundary scan register is provided with a
latched parallel output to prevent changes at the
parallel output while data is shifted in response to
the EXTEST and SAMPLE/PRELOAD instructions.
When the TAP controller is in this state and the
boundary scan register is selected, data is latched
onto the parallel output of this register from the
shift-register path on the falling edge of TCK. The data
held at the latched parallel output does not change
other than in this state.


All shift-register stages in the test data register selected
by the current instruction retain their previous values
during this state. The instruction does not change in
this state.


9.2.4.10 Select-IR-Scan State

This is a temporary controller state. The test data
register selected by the current instruction retains its
previous state. If TMS is held low and a rising edge
is applied to TCK when in this state, the controller
moves into the Capture-IR state, and a scan sequence
for the instruction register is initiated. If TMS
is held high and a rising edge is applied to TCK, the
controller moves to the Test-Logic-Reset state.

The instruction does not change in this state. 

9.2.4.11 Capture-IR State 

In this controller state the shift register contained in 
the instruction register loads the fixed value “0001” 
on the rising edge of TCK. 

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state. 

When the controller is in this state and a rising edge 
is applied to TCK, the controller enters the Exit1-IR 
state if TMS is held high, or the Shift-IR state if TMS 
is held low. 


9.2.4.12 Shift-IR State 

In this state the shift register contained in the instruction
register is connected between TDI and
TDO and shifts data one stage towards its serial output
on each rising edge of TCK.

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state.

When the controller is in this state and a rising edge
is applied to TCK, the controller enters the Exit1-IR
state if TMS is held high, or remains in the Shift-IR
state if TMS is held low.


9.2.4.13 Exit1-IR State

This is a temporary state. While in this state, if TMS
is held high, a rising edge applied to TCK causes the
controller to enter the Update-IR state, which terminates
the scanning process. If TMS is held low and a
rising edge is applied to TCK, the controller enters
the Pause-IR state.




The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state. 

9.2.4.14 Pause-IR State 

The pause state allows the test controller to tempo- 
rarily halt the shifting of data through the instruction 
register. 

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state. 

The controller remains in this state as long as TMS 
is low. When TMS goes high and a rising edge is 
applied to TCK, the controller moves to the Exit2-IR 
state. 


9.2.4.15 Exit2-IR State 

This is a temporary state. While in this state, if TMS
is held high, a rising edge applied to TCK causes the
controller to enter the Update-IR state, which terminates
the scanning process. If TMS is held low and a
rising edge is applied to TCK, the controller enters
the Shift-IR state.

The test data register selected by the current instruction
retains its previous value during this state. The
instruction does not change in this state.

9.2.4.16 Update-IR State 

The instruction shifted into the instruction register is 
latched onto the parallel output from the shift-regis- 
ter path on the falling edge of TCK. Once the new 
instruction has been latched, it becomes the current 
instruction. 

Test data registers selected by the current instruc- 
tion retain the previous value. 

9.2.5 BOUNDARY SCAN REGISTER CELL 

The boundary scan register for each component 
contains a cell for each pin, as well as cells for con- 
trol of I/O and tristate pins. 

9.2.5.1 82495XP Boundary Scan Register Cell

The following is the bit order of the 82495XP bound- 
ary scan register: (from left to right and top to bot- 
tom) 


TDI -> MKEN# KWEND# SWEND# BGT#
CNA# BRDY# RESERVED CRDY# MWBWT#
DRCTM# MRO# CWAY# FPFLD# SNPCYC#
SNPBSY# MHITM# MTHIT# CAHOLD FSIOUT#
PALLC# SNPADS# CADS# CDTS# CWR#
CDC# CMIO# RDYSRC MCACHE# KLOCK#
SMLN# NENE# CFA3 CFA2 TAG11 TAG10 TAG9
TAG8 TAG7 TAG6 TAG5 TAG4 TAG3 TAG2 TAG1
TAG0 SET10 SET9 SET8 SET7 CLK SET6 SET5
SET4 SET3 SET2 SET1 SET0 CFA6 CFA5 CFA4
CFA1 CFA0 ADS# LEN BLAST# BRDYC1#
BRDYC2# CACHE# LOCK# BLE# BOFF# KEN#
AHOLD WR# MIO# DC# PWT PCD HITM# PCYC
EADS# NA# INV WBWT# WAY WRARR#
MCYC# BUS# MAWEA# WBWE# WBA WBTYP
MCFA0 MCFA1 MCFA4 MCFA5 MCFA6 MSET0
MSET1 MSET2 MSET3 MSET4 MSET5 MSET6
MSET7 MSET8 MSET9 MSET10 MTAG0 MTAG1
MTAG2 MTAG3 MTAG4 MTAG5 MTAG6 MTAG7
MTAG8 MTAG9 MTAG10 MTAG11 MCFA2 MCFA3
RESET MAOE# MBAOE# SNPCLK SNPSTB#
EWBE# MPIC# SNPINV FLUSH# SYNC#
SNPNCA MBALE MALE MACTL OCTL CFA4CTL
CFA5CTL CACTL FPFLDCTL WBWTCTL
NACTL -> TDO

“RESERVED” signals correspond to no connect 
“NC” signals on the 82495XP. 

EWBE# and MPIC# will be implemented in the 
82495XP B-stepping; omit them from the boundary scan 
register for A-stepping 82495XPs. 

All the *CTL cells are control cells that are used to 
select the direction of bidirectional pins or tristate 
output pins. If “1” is loaded into the control 
cell(*CTL), the associated pin(s) are tristated or se- 
lected as input. The following lists the control cells 
and their corresponding pins. 

1. MACTL controls the MSET0-10, MTAG0-11, 
and MCFA0-6 pins. 

2. OCTL controls the WAY, WRARR#, MCYC#, 
MAWEA#, BUS#, WBWE#, WBA, WBTYP, INV, 
EADS#, AHOLD, KEN#, BOFF#, BLE#, 
BRDYC2#, BRDYC1#, BLAST#, NENE#, 
SMLN#, KLOCK#, MCACHE#, RDYSRC, 
CMIO#, CDC#, CWR#, CDTS#, CADS#, 
SNPADS#, PALLC#, FSIOUT#, CAHOLD, 
MTHIT#, MHITM#, SNPBSY#, SNPCYC#, 
CWAY, EWBE#, and MPIC# output pins. 

3. CFA4CTL controls the CFA4 pin. 

4. CFA5CTL controls the CFA5 pin. 

5. CACTL controls the SET0-10, TAG0-11, 
CFA0-3, and CFA6 pins. 

6. FPFLDCTL controls the FPFLD# pin. 

7. WBWTCTL controls the WB/WT# pin. 

8. NACTL controls the NA# pin. 

9.2.5.2 82490XP Boundary Scan Register Cell 

The following is the bit order of the 82490XP boundary 
scan register (from left to right and top to bottom): 

TDI -> CDCTL WR# BLAST# BRDYC# 
BRDY# HITM# ADS# BE# A0 A1 A2 A3 A4 A5 A6 
A7 A8 A9 A10 A11 A12 A13 A14 A15 MDATA7 
MDATA3 MDATA6 MDATA2 MDATA5 MDATA1 
MDATA4 MDATA0 MDCTL MDOE# MZBT# 
MBRDY# MEOC# MFRZ# MSEL# MCLK MOCLK 
RESET PAR# RESERVED BOFF# WBTYP WBA 
WBWE# BUS# MAWEA# MCYC# CRDY# 
WRARR# WAY CDATA4 CDATA0 CDATA2 
CDATA5 CDATA6 CDATA1 CDATA3 
CDATA7 -> TDO 

“RESERVED” signals correspond to no connect 
“NC” signals on the 82490XP. 

All the *CTL cells are control cells that are used to 
select the direction of bidirectional pins or tristate 
output pins. If “1” is loaded into the control 
cell (*CTL), the associated pin(s) are tristated or 
selected as input. The following lists the control cells 
and their corresponding pins. 

1. CDCTL controls the CDATA0-7 pins. 

2. MDCTL controls the MDATA0-7 pins. 

9.2.6 TAP CONTROLLER INITIALIZATION 

The TAP controller is automatically initialized when a 
device is powered up. In addition, the TAP controller 
can be initialized by applying a high signal level on 
the TMS input for five TCK periods. 
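
The five-clock rule can be checked against the TAP state diagram. The sketch below is a Python model, not part of the 82495XP documentation: it assumes the standard IEEE 1149.1 sixteen-state TAP controller, and the names TAP_NEXT and clock_tap are hypothetical helpers introduced only for illustration.

```python
# Assumed standard IEEE 1149.1 TAP state table:
# state -> (next state if TMS = 0, next state if TMS = 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def clock_tap(state, tms_bits):
    """Apply one rising TCK edge per TMS value and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


if __name__ == "__main__":
    # Holding TMS high for five TCK periods reaches reset from every state.
    for start in TAP_NEXT:
        print(start, "->", clock_tap(start, [1] * 5))
```

In this model four clocks are not always enough: starting from Shift-DR, four TMS-high clocks stop at Select-IR-Scan, and the fifth reaches Test-Logic-Reset.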

9.2.7 BOUNDARY SCAN SIGNAL DESCRIPTION 
AND TIMINGS 

The functionality of TDI, TMS, TDO, and TCK are 
described in Chapter 7. The A.C. timing specifica- 
tions for the boundary scan signals are located in 
Chapter 10. 

9.3 Tri-State Output Test Mode 

The 82495XP has the ability to tri-state all of its outputs 
and bidirectional pins and to disable all pull-ups 
and pull-downs. During tri-state output test mode all 
pins floated during bus hold as well as those which 
are never floated during normal operation are 
tri-stated. When the 82495XP is in tri-state output 
test mode, external testing can be used to test 
board interconnections. 

On the 82495XP, tri-state output test mode is invoked 
by driving HIGHZ# (MBALE) and SLFTST# (CRDY#) 
active to the 82495XP at least 10 clocks 
prior to the deassertion of RESET. Note that 
HIGHZ# has priority over SLFTST#. When both 
HIGHZ# and SLFTST# are driven active, the 
82495XP will invoke the tri-state output mode and 
not invoke BIST. 

Once tri-state output test mode is invoked, the 
82495XP remains in it until the next RESET. 


9.4 82490XP Cache SRAM Testing 

The 82490XP cache SRAM can be tested using 
standard cache memory testing techniques. Code 
must be written to: 

1. Flush and reset the 82495XP/82490XP/CPU 
cache 

2. Write 1's to every bit of a block of memory equal 
to the cache size 

3. Read the block of memory to fill the cache, tagging 
the data as read-only using the MRO# signal 

4. Write 0's to every bit in the block of memory 

5. Read the block; the cache hits should be all 1's 

6. Repeat the process, exchanging 0 for 1 and 1 for 
0 

In this example, the code to test the cache must be 
non-cacheable to the 82495XP. Also, the CPU 
cache must be on so that the 82495XP will perform 
line-fills. 
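
The six steps above can be sketched as a toy simulation. This is a Python model rather than target test code: it assumes that a line filled as read-only is not updated by later writes, which is the property the procedure relies on. The class ReadOnlyCacheModel and the functions run_pattern_test and cache_test are hypothetical names introduced for illustration.

```python
class ReadOnlyCacheModel:
    """Toy cache model: every line is filled as read-only (MRO# asserted)."""

    def __init__(self, size):
        self.memory = [0] * size
        self.cache = {}

    def flush(self):
        # Step 1: flush and reset the cache.
        self.cache.clear()

    def write(self, addr, value):
        # Assumption: a write to a read-only line updates memory only.
        self.memory[addr] = value

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]  # line fill on a miss
        return self.cache[addr]


def run_pattern_test(model, size, pattern):
    """Run steps 1-5 for one byte pattern; return True if all hits match."""
    inverse = pattern ^ 0xFF
    model.flush()
    for addr in range(size):          # step 2: write the pattern to memory
        model.write(addr, pattern)
    for addr in range(size):          # step 3: read to fill the cache
        model.read(addr)
    for addr in range(size):          # step 4: write the inverse to memory
        model.write(addr, inverse)
    # Step 5: hits must return the cached pattern, not the new memory data.
    return all(model.read(addr) == pattern for addr in range(size))


def cache_test(model, size):
    # Step 6: repeat with the patterns exchanged.
    return run_pattern_test(model, size, 0xFF) and run_pattern_test(model, size, 0x00)


if __name__ == "__main__":
    print(cache_test(ReadOnlyCacheModel(64), 64))
```

Because memory holds the inverse pattern at step 5, any read that misses the cache, or any stuck cache bit, shows up as a mismatch.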


10.0 AC/DC SPECIFICATIONS 


10.1 Background 

The 82495XP has four main interfaces: CPU bus, 
memory bus controller, memory bus, and 82490XP. 
The memory bus controller is typically implemented 
with PLD devices. The MBC interface signal timings 
are, therefore, generated based on available, 
off-the-shelf PLD specs. The memory bus interface was 
specified to suit a generic memory interface which 
works up to CPU frequency. 

10.2 D.C. Specifications 


Table 10-1. D.C. Specifications 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 

Symbol  Parameter                Min    Max        Unit  Notes 
VIL     Input Low Voltage        -0.3   +0.8       V     TTL Level 
VIH     Input High Voltage       2.0    Vcc + 0.3  V     TTL Level 
VOL     Output Low Voltage              0.45       V     TTL Level (1) 
VOH     Output High Voltage      2.4               V     TTL Level (2) 
ICC     Power Supply Current            550        mA    82495XP @ 50 MHz, (3) 
                                        300              82490XP @ 50 MHz 
Power   Power Dissipation               2.75       W     82495XP @ 50 MHz, (4) 
                                        1.50             82490XP @ 50 MHz 
ILI     Input Leakage Current           ±15        µA    0 ≤ VIN ≤ Vcc 
ILO     Output Leakage Current          ±15        µA    0 ≤ VOUT ≤ Vcc, Tristate 
IIL     Input Leakage Current           200        µA    VIN = 0.45V, (5) 
CIN     Input Capacitance               14         pF    for 82495XP 
                                        5                for 82490XP 
CO      Output Capacitance              18         pF    for 82495XP 
                                        15               for 82490XP 
CI/O    I/O Capacitance                 18         pF    for 82495XP 
                                        15               for 82490XP 
CCLK    CLK Input Capacitance           14         pF    for 82495XP 
                                        5                for 82490XP 
CTIN    Test Input Capacitance          15         pF    for 82495XP 
                                        10               for 82490XP 
CTOUT   Test Output Capacitance         15         pF    for 82495XP 
                                        10               for 82490XP 
CTCK    Test Clock Capacitance          15         pF    for 82495XP 
                                        10               for 82490XP 

NOTES: 

(1) Parameter measured at 4 mA load. 
    For MCFA6-MCFA0, MSET10-MSET0, and MTAG11-MTAG0, this parameter is measured at 16 mA load. 

(2) Parameter measured at 1 mA load. 
    For MCFA6-MCFA0, MSET10-MSET0, and MTAG11-MTAG0, this parameter is measured at 2 mA load. 

(3) Typical supply current is 400 mA. 

(4) Typical power dissipation is 2W. 

(5) This parameter is for input with pullup. 

10.3 A.C. Specifications 

All TTL timing specs are measured at 1.5V for both “0” and “1” logic level. 


Table 10-2. Clock, Reset, and Configuration 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                               Min      Max  Unit  Figure  Notes 
t0      CLK, MCLK, MOCLK Frequency              16.6     50   MHz           1x clock 
t1      CLK, MCLK, MOCLK Stability                       0.1  % 
t2      CLK, MCLK, MOCLK Period                 20       60   ns    10-1 
t3      CLK, MCLK, MOCLK High Time              7             ns    10-1    (1) 
t4      CLK, MCLK, MOCLK Low Time               7             ns    10-1    (1) 
t5      CLK, MCLK, MOCLK Rise Time                       2    ns            (1) 
t6      CLK, MCLK, MOCLK Fall Time                       2    ns            (1) 
t7      RESET Setup Time                        7             ns    10-4 
t8      RESET Hold Time                         2             ns    10-4 
t9      RESET Duration                          8 x t2        ns    10-4    for 82495XP, (2) 
                                                15 x t2                     for 82490XP 
t10     All Configurations CFG3-CFG0,           10 x t2       ns    10-4    (3), (4) 
        CPUTYP, SNPMD, PLOCKEN, 
        MEMLDRV, 82490XPLDRV, HIGHZ#, 
        SLFTST# Setup Time 
t11     All Configurations CFG3-CFG0,           0             ns    10-4    (3), (5) 
        CPUTYP, SNPMD, PLOCKEN, 
        MEMLDRV, 82490XPLDRV, HIGHZ#, 
        SLFTST# Hold Time 
t12     FLUSH#, SYNC# Setup Time                8             ns    10-3    for 82495XP, (6) 
t13     FLUSH#, SYNC# Hold Time                 1             ns    10-3    for 82495XP, (7) 
t14     FLUSH#, SYNC# Duration                  2 x t2        ns            (8) 
t15     MOCLK falling edge to MCLK rising edge  2 
t16     FERR#, HLDA Valid Delay                 2             ns 
t17     FERR#, HLDA Float Delay                               ns 
t18     HOLD, BOFF# Setup Time                  7             ns    10-3 
t19     HOLD, BOFF# Hold Time                   2             ns    10-3 

NOTE: 

(1) Rise/Fall, High/Low times measured between 0.8V and 2.0V. 

(2) Power up reset duration should be 1 ms after Vcc and CLK are stable. If configuration inputs with pullups are left floated, 
10 us RESET duration is required. 

(3) Timing is referenced to reset falling edge. 

(4) 8ns setup time is required to guarantee recognition on next clock. 

(5) 1 ns hold time is required to guarantee recognition on next clock. 

(6) To guarantee recognition on next clock. 

(7) Synchronous mode only. 

(8) Asynchronous mode only. To guarantee recognition. 

Table 10-3. Memory Bus Controller 82495XP/82490XP Interface 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                               Min  Max  Unit  Figure  Notes 
t30     BRDY#, CRDY#, KWEND#, SWEND#,           8         ns    10-3    82495XP Only 
        BGT#, CNA#, [WRMRST] Setup Time 
t30a    BRDY#, CRDY# Setup Time                 7         ns    10-3    82490XP Only 
t31     BRDY#, CRDY#, KWEND#, SWEND#,           1         ns    10-3    82495XP Only 
        BGT#, CNA#, [WRMRST] Hold Time 
t32     CW/R#, CD/C#, CMI/O#, RDYSRC,           2    12   ns    10-2 
        MCACHE#, KLOCK#, BLE#, PALLC#, 
        CAHOLD, CWAY, FSIOUT#, CADS#, 
        CDTS#, SNPADS# Valid Delay 
t33     NENE#, SMLN# Valid Delay                2    15   ns    10-2 
t34     MDATA Setup to CLK (clock before        6         ns    10-3 
        BRDY# active) 
t35     MDATA Valid Delay from CLK (CLK from    3    15   ns    10-2 
        CDTS# valid, MDOE# active) 
t36     MDATA Valid Delay from MDOE# active          10   ns    10-2 
t37     MDATA Float Delay from MDOE# inactive   0    14   ns 


Table 10-4. 82495XP Memory Interface 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                  Min  Max  Unit  Figure  Notes 
t50     SNPCLK Frequency                                50   MHz           1x clock (10) 
t51     SNPCLK Period                              20        ns    10-1    (11) 
t52     SNPCLK High Time                           8         ns    10-1 
t53     SNPCLK Low Time                            8         ns    10-1 
t54     SNPCLK Rise Time                                2    ns            (1) 
t55     SNPCLK Fall Time                                2    ns            (1) 
t56     MCFA6-MCFA0, MSET10-MSET0,                 2    13   ns    10-5    (2), (3) 
        MTAG11-MTAG0 Valid Delay 
t57     MCFA6-MCFA0, MSET10-MSET0,                 2    15   ns    10-5    (4) 
        MTAG11-MTAG0 Float Delay 
t58     MCFA6-MCFA0, MSET10-MSET0,                 2    15   ns    10-5    (5) 
        MTAG11-MTAG0 Valid Delay 
t60     MCFA6-MCFA0, MSET10-MSET0,                 2    15   ns    10-2    (6), (12) 
        MTAG11-MTAG0 Valid Delay 
t62a    MCFA6-MCFA0, MSET10-MSET0,                 8         ns    10-3    (7a) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Setup Time 
t62b    MCFA6-MCFA0, MSET10-MSET0,                 1         ns    10-3    (7b) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE# Setup Time 
t62c    MCFA6-MCFA0, MSET10-MSET0,                 8         ns    10-3    (7c) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Setup Time 
t63a    MCFA6-MCFA0, MSET10-MSET0,                 1         ns    10-3    (7a) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Hold Time 
t63b    MCFA6-MCFA0, MSET10-MSET0,                 8         ns    10-3    (7b) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE# Hold Time 
t63c    MCFA6-MCFA0, MSET10-MSET0,                 1         ns    10-3    (7c) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Hold Time 
t64     SNPSTB# Setup Time                         8         ns    10-3    (8) 
t65     SNPSTB# Hold Time                          1         ns    10-3    (8) 
t66     SNPSTB# Active/Inactive Time               8         ns    10-3    (9) 
t67     MRO#, MKEN#, DRCTM#, MWB/WT# Setup Time    8         ns    10-3 
t68     MRO#, MKEN#, DRCTM#, MWB/WT# Hold Time     1         ns    10-3 
t69     MTHIT#, MHITM#, SNPBSY#, SNPCYC#           2    13   ns    10-2 
        Valid Delay 
t69a    SNPCYC# Valid Delay                        2    12   ns    10-2 


NOTES: 

(1) Rise/fall times measured between 0.45V and 2.4V 

(2) See capacitive derating curves for loads above the 50pF specification 

(3) Valid delay from MAOE#, MBAOE# going active (low) 

(4) Float delay from MAOE#, MBAOE# going inactive (high) 

(5) Valid delay from MALE or MBALE if both MAOE#, MBAOE# are active 

(6) Valid delay from CLK only if MALE or MBALE, MAOE# and MBAOE# are active 

(7) a. In clocked mode referenced to SNPCLK rising edge 

b. In strobed mode referenced to SNPSTB# falling edge 

c. In synchronous mode, refer to CLK 

(8) Asynchronous clocked mode only. Timings referenced to SNPCLK 

(9) Asynchronous signal. Time to guarantee recognition on next clock 

(10) SNPCLK is only used for the clocked memory bus mode 

(11) t51>t2 

(12) This parameter is valid either from SNPCLK or CLK 

Table 10-5. 82490XP Clocked Mode 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                Min  Max  Unit  Figure  Notes 
t38     MBRDY#, MSEL#, MEOC# Setup to MCLK       5         ns    10-3 
t39     MBRDY#, MSEL#, MEOC# Hold from MCLK      2         ns    10-3 
t40     MZBT#, MFRZ# Setup to MCLK               5         ns    10-3 
t41     MZBT#, MFRZ# Hold from MCLK              2 
t42     MDATA Setup to MCLK                      5         ns 
t43     MDATA Hold from MCLK                     3         ns    10-3 
t44     MDATA Valid Delay from MCLK • MBRDY#     2    16   ns    10-2 
t45     MDATA Valid Delay from MCLK • MEOC#,     2    20   ns    10-2 
        MCLK • MSEL# 
t46     MDATA Valid Delay from MOCLK             2    12   ns    10-2 


Table 10-6. 82490XP Strobed Mode 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                           Min  Max  Unit  Figure  Notes 
t85     MISTB, MOSTB High Time                              12        ns    10-6 
t86     MISTB, MOSTB Low Time                               12        ns    10-6 
t87     MEOC# High Time                                     8         ns    10-6 
t88     MEOC# Low Time                                      8         ns    10-6 
t89     MxSTB, MEOC# Rise Time                                   2    ns            (1) 
t90     MxSTB, MEOC# Fall Time                                   2    ns            (1) 
t91     MSEL# High Time for restart                         8         ns    10-6 
t92     MSEL# Setup before transition on MxSTB              5         ns    10-8 
t93     MSEL# Hold after transition on MxSTB                10        ns    10-8 
t94     MSEL# Hold after transition on MEOC#                2         ns    10-8 
t95     MxSTB transition to/from MEOC# falling transition   10        ns 
t96     MZBT# Setup to MSEL# or MEOC# falling edge          5 
t97     MZBT# Hold from MSEL# or MEOC# falling edge         2         ns    10-7 
t98     MFRZ# Setup to MEOC# falling edge                   5         ns    10-7 
t99     MFRZ# Hold from MEOC# falling edge                  2         ns    10-7 
t100    MDATA Setup to MxSTB or MEOC# falling transition    5         ns    10-7 
t101    MDATA Hold from MxSTB or MEOC# falling transition   2         ns    10-7 
t102    MDATA Valid Delay from MxSTB transition             2    16   ns    10-9 
t103    MDATA Valid Delay from MEOC# falling transition     2    20   ns    10-9 
        or MSEL# deactivation 


NOTE: 

(1) Rise/Fall times are measured between 0.8V and 2.0V 

Table 10-7. Test Mode 

Vcc = 5V ± 5%, Tcase = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 
All Inputs and Outputs are TTL Level. 

Symbol  Parameter                 Min  Max  Unit  Figure  Notes 
t120    TCK Frequency                  25   MHz           1x clock 
t121    TCK Period                40        ns            (2) 
t122    TCK High Time             10        ns            @ 2.0V 
t123    TCK Low Time              10        ns            @ 0.8V 
t124    TCK Rise Time                  4    ns            (1) 
t125    TCK Fall Time                  4    ns            (1) 
t126    TDI, TMS Setup Time       8         ns    10-10 
t127    TDI, TMS Hold Time        7         ns    10-10 
t128    TDO Valid Delay           3    25   ns    10-10 
t129    TDO Float Delay 
t130    All Outputs Valid Delay   3    25   ns    10-10   (3) 
t131    All Outputs Float Delay        36   ns    10-10   (3) 

NOTES: 

(1) Rise/Fall times are measured between 0.8V and 2.0V. Rise/Fall times can be relaxed by 1 ns per 10 ns increase in TCK 
period 

(2) TCK period ≥ CLK period 

(3) Parameter measured from TCK 

Figure 10-3. Setup and Hold Timings 
(tx = t30, 62a, 62c, 64, 67, 76, 85; ty = t31, 63a, 63c, 65, 68, 77, 86) 

Figure 10-3a. Setup and Hold Timings in Strobed Snooping Mode 

Figure 10-4. Reset and Configuration Timings 

Figure 10-5. Memory Interface Signals 

Introduction 

The i860™ 64-bit microprocessor is a general-purpose 
CPU with on-chip integer unit, floating point, memory 
management, caches, and graphics. The i860 micro- 
processor supports 3-D graphics software with the fol- 
lowing functions: 

1. Hidden surface elimination 

2. Distance interpolation 

3. Intensity interpolation for 3-D shading 

The fzchks (Z-buffer Check) and pst (Pixel Store) in- 
structions expedite hidden surface elimination. Dis- 
tance interpolation is accomplished with faddz (Add 
with Z merge), and intensity interpolation occurs with 
faddp (Add with Pixel Merge). The purpose of this ap- 
plication note is to illustrate the intended use of these 
instructions in a manner independent of any graphics 
environment in which the instructions might be used. It 
is not the purpose of this application note to present the 
most efficient instruction sequences. While the inner 
loop of Example 7 has as few instructions as logically 
possible, the other examples are intended to present 
general concepts, not optimum implementations. Tun- 
ing for maximum performance depends on the specific 
environment. 

This application note assumes familiarity with the 
i860™ 64-bit Microprocessor Programmer's Reference 
Manual (Intel order number 240329); the i860 micro- 
processor instructions for graphics are detailed in sec- 
tion 6.6. 


1.0 3-D RENDERING 

This series of examples are routines that might be used 
at the lowest level of a graphics software system to con- 
vert a machine-independent description of a 3-D image 
into values for the frame buffer of a color video display. 
Typically, higher-level graphics routines represent an 
object as a set of polygons that together roughly de- 
scribe the surfaces of the objects to be displayed. The 
graphics system maintains a database that describes 
these polygons in terms of their colors, properties of 
reflectance or translucence, and the locations in 3-D 
space of their vertices. Due to the roughness of the 
representation, the amount of information in the data- 
base is considerably less than that which must be deliv- 
ered to the video display. A rendering procedure, such 
as Example 7, uses interpolation to derive the detailed 
information needed for each pixel in the graphics frame 
buffer. The rendering procedure also performs pixel-by- 
pixel hidden-surface elimination. 

The focus of this series of examples is Example 7, 
which operates on a segment of a scan line. The segment 
is bounded by two points of given location and 
color: from point (X1, Y0, Z1) with color intensities 
Red1, Grn1, Blu1 to point (X2, Y0, Z2) with color 
intensities Red2, Grn2, Blu2. The points and color 
intensities are determined by higher-level graphics software. 
The points represent the intersection of the scan line 
with two edges of the projected image of a polygon. For 
a given scan line, the rendering procedure is executed 
once for each polygon that projects onto that scan line. 
The higher-level graphics software is responsible for 
orienting the objects with respect to the viewer, for 
making perspective calculations, for scaling, and for de- 
termining the amount of light that falls on each poly- 
gon vertex. 

The 16-bit pixel format is used, giving ample resolution 
for color shading: 2^6 intensity values for red, 2^6 
intensity values for green, and 2^4 intensity values for blue. 
Example 1 shows how to set the pixel size. For 
hidden-surface elimination, the Z-buffer (or depth buffer) 
technique is employed, each Z value having a resolution of 
16 bits. 

Because the examples presented here use almost all of 
the registers of the i860 microprocessor, the registers 
are given symbolic names, as defined by Example 2. In 
a real application, it is likely that some of the inputs to 
the rendering procedure would be passed in floating- 
point registers instead of the integer registers employed 
here. The register allocation shown in Example 2 sim- 
plifies the examples by avoiding the need to use any 
register for multiple purposes. 


// SET PIXEL SIZE TO 16 
    ld.c     psr, Ra           // Work on psr 
    andnoth  0x00C0, Ra, Ra    // Clear PS 
    orh      0x0040, Ra, Ra    // PS = 16-bit pixels 
    st.c     Ra, psr           // 


Example 1. Setting Pixel Size 

// REGISTER DEFINITIONS FOR RENDERING PROCEDURE 

// INTEGER LOCALS 
Ra       = r4    // Temporary 
Rb       = r5    // Temporary 
Rc       = r6    // Temporary 
Rd       = r7    // Temporary 

// INTEGER INPUTS 
X1       = r16   // X coordinate of starting point of line segment in pixels 
dX       = r17   // Width of scan line segment in number of pixels 
ZBP      = r18   // Z-buffer pointer to the current line segment 
Z1       = r19   // Initial Z value, fixed-point 16.16 format 
mZ       = r20   // Z slope, fixed-point 16.16 format 
FBP      = r21   // Graphics frame buffer pointer to the current line segment 
Red1     = r22   // Initial red intensity, fixed-point 6.10 format, plus .5 
Grn1     = r23   // Initial green intensity, fixed-point 6.10 format, plus .5 
Blu1     = r24   // Initial blue intensity, fixed-point 6.10 format, plus .5 
mR       = r25   // Red slope, fixed-point 6.10 format 
mG       = r26   // Green slope, fixed-point 6.10 format 
mB       = r27   // Blue slope, fixed-point 6.10 format 

// REAL LOCALS 
aZ       = f2    // Accumulated Z values 
aZh      = f3    // 
iZ1      = f4    // Z interpolant, coefficient 1.0 
iZ1h     = f5    // 
iZ3      = f6    // Z interpolant, coefficient 3.0 
iZ3h     = f7    // 
oldz     = f8    // Original values from the Z-buffer 
newz     = f10   // New Z-buffer values 
newzh    = f11   // 
newi     = f12   // New pixel values 
iR       = f14   // Red interpolant, coefficient 4.0 
iRh      = f15   // 
aR       = f16   // Accumulated red intensities 
aRh      = f17   // 
iG       = f18   // Green interpolant, coefficient 4.0 
iGh      = f19   // 
aG       = f20   // Accumulated green intensities 
aGh      = f21   // 
iB       = f22   // Blue interpolant, coefficient 4.0 
iBh      = f23   // 
aB       = f24   // Accumulated blue intensities 
aBh      = f25   // 
lZmask   = f26   // left-end Z mask 
lZmaskh  = f27   // 
rZmask   = f28   // right-end Z mask 
rZmaskh  = f29   // 


Example 2. Register Assignments 







2.0 DISTANCE INTERPOLATION 

To perform hidden surface elimination at each pixel, 
the rendering routine first interpolates the value of Z at 
each pixel. Distance interpolation consists of calculating 
the slope of Z over the given line segment, then 
increasing the Z value of each successive pixel by that 
amount, starting from X1. The width of the line segment 
in pixels is . . . 

dX = X2 - X1 

Calculate the reciprocal of dX: 

RdX = 1/dX 

The value of dX is used several times as a divisor. It is 
most efficient to calculate its reciprocal once, then, instead 
of dividing by dX, multiply by RdX. The slope of Z is . . . 

mZ = (Z2 - Z1)*RdX 

Because each polygon is a plane, the value of mZ is 
constant for all scan lines that intersect the polygon; 
therefore mZ needs to be calculated only once for each 
polygon. Example 7 assumes that dX and mZ have already 
been calculated, and all that remains is to apply 
mZ to successive pixels. Let Z(Xn) be the Z value at 
pixel Xn. Then . . . 

Z(X1) = Z1 
Z(X1 + 1) = Z1 + mZ 
Z(X1 + 2) = Z1 + 2*mZ 
. . . 
Z(X1 + N) = Z1 + N*mZ 
. . . 
Z(X1 + dX) = Z1 + dX*mZ = Z(X2) 

Figure 1 illustrates this Z-value interpolation. 
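
The interpolation formulas above can be sketched in Python for checking values by hand. This is not the i860 code of Example 7; it assumes the 16.16 fixed-point format that the register definitions give for Z1 and mZ, and the function names to_fixed, z_slope and interpolate_z are hypothetical.

```python
FRAC_BITS = 16  # 16.16 fixed point, as used for Z1 and mZ


def to_fixed(value):
    """Convert a number to 16.16 fixed point."""
    return int(round(value * (1 << FRAC_BITS)))


def z_slope(z1, z2, x1, x2):
    """mZ = (Z2 - Z1) * RdX, where RdX = 1/dX and dX = X2 - X1."""
    dX = x2 - x1
    RdX = 1.0 / dX          # reciprocal computed once, then multiplied
    return to_fixed((z2 - z1) * RdX)


def interpolate_z(z1, mZ, dX):
    """Return the integer part of Z(X1 + N) = Z1 + N*mZ for N = 0..dX."""
    acc = to_fixed(z1)
    values = []
    for _ in range(dX + 1):
        values.append(acc >> FRAC_BITS)
        acc += mZ           # one addition per successive pixel
    return values


if __name__ == "__main__":
    mZ = z_slope(100, 500, 10, 18)      # dX = 8, slope of 50 per pixel
    print(interpolate_z(100, mZ, 8))
```

The last value returned equals Z2, matching Z(X1 + dX) = Z(X2).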

Figure 1. Z-Buffer Interpolation 

The faddz instruction helps to perform the above calculations 
64 bits at a time. Because a Z value is 16 bits 
wide, Example 7 operates on the Z buffer in groups of 
four. The faddz instruction, however, treats the interpolation 
values (N*mZ) as 32-bit fixed-point numbers; 
therefore, two faddz instructions are executed for each 
group of four pixels. Because of the way the faddz shifts 
the MERGE register, the first faddz corresponds to 
even-numbered pixels, while the second corresponds to 
odd-numbered pixels. Instead of starting with the value 
for the first pixel (Z(X1)) and adding mZ to each pixel 
to produce the value for the next pixel, the example 
procedure starts with the values for the first two 
even-numbered pixels and adds 1*mZ to each of these values 
to produce the values for the adjacent odd-numbered 
pair. Adding 3*mZ to each of the Z values of an 
odd-numbered pair produces the values for the next 
even-numbered pair. Figure 2 shows one way of constructing 
the operands before starting the distance interpolations. 
(The initial value given to src1 depends on the alignment 
of the first pixel.) Table 1 helps to visualize the 
process. 

After two faddz instructions, the MERGE register 
holds the Z values for four adjacent pixels (in the cor- 
rect order). The form instruction copies MERGE into 
one of the 64-bit floating-point registers. 



Figure 2. faddz Operands 
Table 1. faddz Visualization 

              Operands          MERGE Register 
              63-32   31-0      63-48  47-32  31-16  15-0 
src1          -1.0    -3.0 
src2           3.0     3.0 
rdest/src1     2.0     0.0        2             0 
src2           1.0     1.0 
rdest/src1     3.0     1.0        3      2      1      0 
src2           3.0     3.0 
rdest/src1     6.0     4.0        6             4 
src2           1.0     1.0 
rdest/src1     7.0     5.0        7      6      5      4 
src2           3.0     3.0 
rdest/src1    10.0     8.0       10             8 
src2           1.0     1.0 
rdest/src1    11.0     9.0       11     10      9      8 
src2           3.0     3.0 
rdest/src1    14.0    12.0       14            12 
src2           1.0     1.0 
rdest         15.0    13.0       15     14     13     12 


Because the values of Z1 and mZ are constant for each loop through the rendering routine, the numbers shown here are 
the values of the coefficient N, where the actual operands have the values Z1 + N*mZ. For each execution of faddz, src1 
is the same as rdest of the prior faddz. After every two faddz instructions, a form instruction empties the MERGE register. 

// CONSTRUCT INTERPOLANTS iZ1 AND iZ3 GIVEN mZ 
    ixfr     mZ, iZ1           // Join each half in 64-bit register 
    shl      1, mZ, Ra         // Ra = 2*mZ 
    adds     Ra, mZ, Ra        // Ra = 3*mZ 
    ixfr     Ra, iZ3           // Join each half in 64-bit register 
    fmov.ss  iZ1, iZ1h         // Join each half in 64-bit register 
    fmov.ss  iZ3, iZ3h         // Join each half in 64-bit register 


Example 3. Construction of Z Interpolants 




The same register is used as both src1 and rdest in all 
faddz instructions. This register serves to accumulate Z 
values for successive pixels; therefore, it is called an 
accumulator. The registers used as src2 are called 
interpolants. The code in Example 3 constructs the 
interpolants; it needs to be executed only once for each 
polygon. 
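
The accumulator and interpolant scheme can be traced with a small Python model. This is a simplified sketch, not a definition of the faddz instruction: it assumes each 64-bit operand behaves as two independent 32-bit values in 16.16 format, that each faddz shifts MERGE right by 16 bits and loads the integer parts into bits 63-48 and 31-16 (as the MERGE columns of Table 1 suggest), and that the first pixel is aligned on an 8-byte boundary. The function names faddz and z_groups are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def faddz(acc, interp, merge):
    """Simplified faddz model. acc and interp are (high, low) 32-bit pairs."""
    high = (acc[0] + interp[0]) & MASK32
    low = (acc[1] + interp[1]) & MASK32
    # Shift MERGE right by 16 and keep only the two fields that move down.
    merge = (merge >> 16) & 0x0000FFFF0000FFFF
    # Load the integer part of each half into bits 63-48 and 31-16.
    merge |= ((high >> 16) << 48) | ((low >> 16) << 16)
    return (high, low), merge


def z_groups(z1, mZ, groups):
    """Return integer Z values for 4*groups pixels, lowest pixel first."""
    # Initial accumulator from Table 1: coefficients -1.0 (high), -3.0 (low).
    acc = ((z1 - mZ) & MASK32, (z1 - 3 * mZ) & MASK32)
    iZ1 = (mZ & MASK32, mZ & MASK32)
    iZ3 = ((3 * mZ) & MASK32, (3 * mZ) & MASK32)
    values = []
    merge = 0
    for _ in range(groups):
        acc, merge = faddz(acc, iZ3, merge)   # even-numbered pixels
        acc, merge = faddz(acc, iZ1, merge)   # odd-numbered pixels
        values.extend((merge >> shift) & 0xFFFF for shift in (0, 16, 32, 48))
        merge = 0                             # the form instruction empties MERGE
    return values


if __name__ == "__main__":
    # Z1 = 100.0 and mZ = 1.0 in 16.16 format: Z values 100 through 115.
    print(z_groups(100 << 16, 1 << 16, 4))
```

With Z1 = 100 and mZ = 1 the four groups reproduce the coefficients 0 through 15 of Table 1, offset by Z1.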


3.0 COLOR INTERPOLATION 

To determine the RGB color intensities at each pixel, 
the rendering routine interpolates between the color 
intensities at the end points. (This rendering technique is 
called “Gouraud shading” after H. Gouraud, “Continuous 
Shading of Curved Surfaces,” IEEE Transactions 
on Computers, C-20(6), June 1971, pp. 623-628.) Let 
the symbol C (color) represent either R (red), G 
(green), or B (blue). Color interpolation consists of 
calculating the slope of C over the given line segment, then 
increasing the C values of each successive pixel by that 
amount, starting from the values for X1. This must be 
done for C = R, C = G, and C = B. The slope of C is . . . 

mC = (C2 - C1)*RdX 

. . . where RdX = 1/dX 


The value of mC is constant for all scan lines that intersect 
a given pair of polygon edges; therefore mC needs 
to be calculated only once for each such pair. Example 
7 assumes that mC has already been calculated for all 
colors, and all that remains is to apply mC to successive 
pixels. Let C(Xn) be a C value at pixel Xn. Then . . . 

C(X1) = C1 
C(X1 + 1) = C1 + mC 
C(X1 + 2) = C1 + 2*mC 
. . . 
C(X1 + N) = C1 + N*mC 
. . . 
C(X1 + dX) = C1 + dX*mC = C(X2) 

Figure 3 illustrates Gouraud shading of a triangle. 

The faddp instruction performs the above calculations 
64 bits at a time. Because a pixel is 16 bits wide, Example 7 
operates on pixels in groups of four. Instead of 
starting with the value for the first pixel (C(X1)) and 
adding mC to each pixel to produce the value for the 
next pixel, the example procedure starts with the values 
for the first four pixels and adds 4*mC to each group of 
four to produce the values for the next four. Three 
faddp instructions are executed for each group of four 
pixels. The first increments the blue values; the second, 
green; the third, red. Figure 4 shows one way of 
constructing the operands for each color before starting the 
color interpolations. (The initial value given to src1 
depends on the alignment of the first pixel.) 

Setup of the accumulator and interpolants is similar to 
that of the Z-buffer. The code in Example 4 constructs 
the interpolants; it needs to be executed only once for 
each pair of edges in each polygon. 
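
The group-of-four scheme can be compared with pixel-by-pixel interpolation in a short Python sketch. It is a model of the arithmetic only, not of the faddp instruction: it assumes 6.10 fixed-point intensities as in the register definitions, treats the four 16-bit fields as independent (no carries between fields), and assumes an aligned first pixel. The function names shade_by_groups and shade_per_pixel are hypothetical.

```python
FRAC_BITS = 10      # 6.10 fixed point: 6 integer bits, 10 fraction bits
FIELD = 0xFFFF      # each intensity occupies one 16-bit field


def shade_by_groups(c1, mC, groups):
    """Step four accumulator fields by 4*mC; return integer intensities."""
    # Accumulator fields: C1, C1 + mC, C1 + 2*mC, C1 + 3*mC.
    acc = [(c1 + n * mC) & FIELD for n in range(4)]
    step = (4 * mC) & FIELD                 # the interpolant, 4*mC per field
    values = []
    for _ in range(groups):
        values.extend(field >> FRAC_BITS for field in acc)
        acc = [(field + step) & FIELD for field in acc]
    return values


def shade_per_pixel(c1, mC, count):
    """Reference: C(X1 + N) = C1 + N*mC computed one pixel at a time."""
    return [((c1 + n * mC) & FIELD) >> FRAC_BITS for n in range(count)]


if __name__ == "__main__":
    c1 = (10 << FRAC_BITS) + (1 << (FRAC_BITS - 1))   # intensity 10, plus .5
    print(shade_by_groups(c1, 300, 3))
    print(shade_per_pixel(c1, 300, 12))
```

Both functions return the same twelve intensities, which is why one addition per color per group of four pixels is sufficient.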


4.0 BOUNDARY CONDITIONS 

The i860 microprocessor operates on 64-bit quantities 
that are aligned on 8-byte boundaries. The code in this 
example takes full advantage of this design, handling 
four 16-bit pixels in each loop. However, if the first or 
last pixel of a line segment is not on an 8-byte boundary, 
two kinds of special considerations are required: 

1. Masking of Z values near the end points. 

2. Initialization of the accumulators. 

4.1 Z-Buffer Masking 

When either the first or last pixel of the line segment is 
not at an 8-byte boundary, the rendering procedure 
must mask the first or last set of new Z-buffer values 
(newz) so that the Z-buffer and the frame buffer are not 
erroneously updated. Sometimes both the first and last 
pixels are in the same 4-pixel set, in which case either 
one may not be on an 8-byte boundary. A function that 
looks up and calculates masks is shown in Example 5. 

Because the value 0xFFFF is used for masking, the 
Z-buffer is initialized with 0xFFFE, so that the fzchks 
instruction always finds the mask to be greater than 
any Z-buffer contents. 


              63          47          31          15          0
Accumulator   | C1+3*mC   | C1+2*mC   | C1+mC     | C1        |
Interpolant   | 4*mC      | 4*mC      | 4*mC      | 4*mC      |

(Each 16-bit field holds an integer part followed by a fraction part.)

Figure 4. faddp Operands


// CONSTRUCT INTERPOLANTS iR, iG, iB GIVEN mR, mG, mB
    shl     18, mR, Ra      // Multiply each color slope by four, then
    shl     18, mG, Rb      // shift by 16 to put the significant
    shl     18, mB, Rc      // bits into the high-order half
    shr     16, Ra, mR      // Return significant 16 bits
    shr     16, Rb, mG      // to low-order half. Any sign bits
    shr     16, Rc, mB      // in high-order half are gone.
    or      mR, Ra, Ra      // Join 16-bit quarters
    or      mG, Rb, Rb      // in 32-bit register
    or      mB, Rc, Rc      //
    ixfr    Ra, iR          // Join 32-bit halves
    ixfr    Rb, iG          // in 64-bit register
    ixfr    Rc, iB          //
    fmov.ss iR, iRh         //
    fmov.ss iG, iGh         //
    fmov.ss iB, iBh         //
.macro zmask l_align, r_align, Rx, Ry
// l_align, r_align - left- and right-end alignment [0..3] in 2-byte units
// Rx, Ry - scratch registers
    .data
    .align  8
left_mask::                             // low         high
    .long   0x00000000, 0x00000000      // 0 mod 4
    .long   0x0000FFFF, 0x00000000      // 1 mod 4
    .long   0xFFFFFFFF, 0x00000000      // 2 mod 4
    .long   0xFFFFFFFF, 0x0000FFFF      // 3 mod 4
right_mask::                            // low         high
    .long   0xFFFF0000, 0xFFFFFFFF      // 0 mod 4
    .long   0x00000000, 0xFFFFFFFF      // 1 mod 4
    .long   0x00000000, 0xFFFF0000      // 2 mod 4
    .long   0x00000000, 0x00000000      // 3 mod 4
    .text
    shl     3, l_align, l_align         // Multiply by 8
    mov     left_mask, Rx               //
    fld.d   l_align(Rx), lZmask         // Load 8-byte mask
    shl     3, r_align, r_align         // Multiply by 8
    mov     right_mask, Rx              //
    fld.d   r_align(Rx), rZmask         // Load 8-byte mask
// If the first and last pixels are contained in the same 64-bit
// aligned set, then lZmask = lZmask OR rZmask.
    andh    0x8000, dX, r0              // Is dX negative
    bc      L2                          // If not, right end is in other set
    fxfr    lZmask, Rx                  //
    fxfr    rZmask, Ry                  //
    or      Rx, Ry, Rx                  // OR low-order half
    ixfr    Rx, lZmask                  //
    fxfr    lZmaskh, Rx                 //
    fxfr    rZmaskh, Ry                 //
    or      Rx, Ry, Rx                  // OR high-order half
    ixfr    Rx, lZmaskh                 //
L2:: nop                                //
.endm
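
The mask lookup performed by the macro above can be sketched in Python. The mask values are the ones in the listing, joined into 64-bit integers (high half shifted left by 32); the function names and the `same_set` flag are hypothetical and stand in for the test on dX.

```python
# 64-bit masks indexed by alignment in 2-byte units (0..3); 0xFFFF marks a
# 16-bit Z field that must not be updated.
LEFT_MASK = [
    0x0000000000000000,
    0x000000000000FFFF,
    0x00000000FFFFFFFF,
    0x0000FFFFFFFFFFFF,
]
RIGHT_MASK = [
    0xFFFFFFFFFFFF0000,
    0xFFFFFFFF00000000,
    0xFFFF000000000000,
    0x0000000000000000,
]


def zmask(l_align, r_align, same_set):
    """Return (left mask, right mask); OR them if both ends share one set."""
    left = LEFT_MASK[l_align]
    right = RIGHT_MASK[r_align]
    if same_set:
        left |= right
    return left, right


def masked_fields(mask):
    """List the 16-bit field positions (0 = lowest) that the mask blocks."""
    return [i for i in range(4) if (mask >> (16 * i)) & 0xFFFF == 0xFFFF]
```

For a segment that starts at field 1 and ends at field 2 of the same 4-pixel set, the combined mask blocks fields 0 and 3, so only the two interior fields can be updated.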
Table 2. Accumulator Initial Values

Alignment   Initial Z Accumulator Values
    0       Z1 - 1*mZ    Z1 - 3*mZ
    2       Z1 - 2*mZ    Z1 - 4*mZ
    4       Z1 - 3*mZ    Z1 - 5*mZ
    6       Z1 - 4*mZ    Z1 - 6*mZ

Alignment   Initial Color Accumulator Values (C = R, G, B)
    0       C1 - 1*mC    C1 - 2*mC    C1 - 3*mC    C1 - 4*mC
    2       C1 - 2*mC    C1 - 3*mC    C1 - 4*mC    C1 - 5*mC
    4       C1 - 3*mC    C1 - 4*mC    C1 - 5*mC    C1 - 6*mC
    6       C1 - 4*mC    C1 - 5*mC    C1 - 6*mC    C1 - 7*mC



Table 3. Accumulator Initialization Table

                                    Table Values
Alignment   *mZ       *mR               *mG               *mB
    0       -1, -3    -1, -2, -3, -4    -1, -2, -3, -4    -1, -2, -3, -4
    2       -2, -4    -2, -3, -4, -5    -2, -3, -4, -5    -2, -3, -4, -5
    4       -3, -5    -3, -4, -5, -6    -3, -4, -5, -6    -3, -4, -5, -6
    6       -4, -6    -4, -5, -6, -7    -4, -5, -6, -7    -4, -5, -6, -7


4.2 Accumulator Initialization 

When the first pixel of the line segment is not at an 8-byte
boundary, initial values placed in the accumulators
(aZ, aB, aG, and aR) must be selected so that Z1,
Red1, Grn1, and Blu1 correspond to the correct pixel.
The desired result is that shown by Table 2. However,
each value is a composite of two terms: one that is
constant for each edge pair (n*mZ, n*mR, n*mG,
n*mB) and one that can vary with each scan line (Z1,
Red1, Grn1, Blu1). The example assumes that the constant
values have all been calculated and stored in a
memory table of the format shown by Table 3. At the
beginning of each line segment the values appropriate
to the alignment of the line segment are retrieved from
the table and added to the initial Z and color values, as
shown in Example 6.
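
The two-term initialization can be sketched in Python. The multipliers come from Table 3; plain integers are used instead of packed fixed-point fields, and the function and key names are hypothetical.

```python
# Multipliers n from Table 3, keyed by left-end alignment in bytes.
Z_MULT = {0: (-1, -3), 2: (-2, -4), 4: (-3, -5), 6: (-4, -6)}
C_MULT = {
    0: (-1, -2, -3, -4),
    2: (-2, -3, -4, -5),
    4: (-3, -4, -5, -6),
    6: (-4, -5, -6, -7),
}


def build_row(alignment, mz, mr, mg, mb):
    """Constant terms n*m; computed once per edge pair."""
    return {
        "z": tuple(n * mz for n in Z_MULT[alignment]),
        "r": tuple(n * mr for n in C_MULT[alignment]),
        "g": tuple(n * mg for n in C_MULT[alignment]),
        "b": tuple(n * mb for n in C_MULT[alignment]),
    }


def init_accumulators(row, z1, red1, grn1, blu1):
    """Add the per-scan-line starting values to the constant terms."""
    starts = {"z": z1, "r": red1, "g": grn1, "b": blu1}
    return {key: tuple(starts[key] + term for term in row[key]) for key in row}
```

Only `init_accumulators` has to run for each scan line; `build_row` corresponds to the table that the text assumes has already been filled in.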

5.0 THE INNER LOOP 

Once the proper preparations have been made, only a
minimal amount of code is needed to render each scan-line
segment of a polygon. The code shown in Example
7 operates on four pixels in each loop. The left and
right ends of the line segment go through different logic
paths so that the Z-buffer masks can be applied by the
form instruction. All the interior points are handled by
the tight inner loop.

The controlling variable dX is zero-relative and is expressed
as a number of pixels. The value of dX also
indicates alignment of the end-points with respect to
the 4-pixel groups. Unaligned left-end pixels are subtracted
from dX before entering the inner loop; therefore,
subsequent values of dX indicate the alignment of
the right end. A value that is 3 mod 4 indicates that the
right end is aligned, which explains the test for a value
of -5 near the end of the loop (-5 mod 4 = 3). The
fact that the value -5 is loaded into register Rb on
every execution of the loop does not represent a programming
inefficiency, because there is nothing else for
the core unit to do at that point anyway.
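
The alignment test on dX can be illustrated in Python, whose integers behave like two's-complement values under bitwise AND. The function names are hypothetical; the sketch shows only why -5 counts as "3 mod 4", not the full loop control.

```python
def alignment(dx):
    """Low two bits of dx, as an AND with 3 yields on a two's-complement value."""
    return dx & 3


def right_end_aligned(dx):
    """A right end is aligned when dx is 3 mod 4."""
    return alignment(dx) == 3
```

Both `-5 & 3` and `-5 % 4` evaluate to 3 in Python, matching the remark that -5 mod 4 = 3.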
// ACCUMULATOR INITIALIZATION TABLE
    .data; .align .double
acc_init_tab::  .double [16] 0
    .dsect
aBi:    .double                     // Four initial 16-bit blue values
aGi:    .double                     // Four initial 16-bit green values
aRi:    .double                     // Four initial 16-bit red values
aZi:    .double                     // Two initial 32-bit Z values
    .end
    .text

// INITIALIZE ACCUMULATORS
.macro acc_init Lalign, Rtab, Rx, Ry, Fx, Fxh
// Lalign - left-end alignment (0..3) in two-byte units
// Rtab - register to use for addressing the table
// Rx, Ry, Fx, Fxh - scratch registers
    mov      acc_init_tab, Rtab     //
    shl      5, Lalign, Lalign      // Multiply by row width
    adds     Lalign, Rtab, Rtab     // Index row corresponding to alignment
    fld.d    aZi(Rtab), aZ          // Z
    ixfr     Z1, Fx                 // Z
    fld.d    aRi(Rtab), aR          // R-Load constant values
    shl      16, Red1, Rx           // R-Shift starting value to hi-order
    fmov.ss  Fx, Fxh                // Z
    shr      16, Rx, Ry             // R-Red1 stripped of sign bits
    fiadd.dd Fx, aZ, aZ             // Z
    or       Rx, Ry, Ry             // R-Form (Red1, Red1)
    ixfr     Ry, Fx                 // R-Put in 64-bit register
    fld.d    aGi(Rtab), aG          // G
    shl      16, Grn1, Rx           // G
    fmov.ss  Fx, Fxh                // R-Form (Red1, Red1, Red1, Red1)
    shr      16, Rx, Ry             // G
    fiadd.dd Fx, aR, aR             // R-Add variables to constants
    or       Rx, Ry, Ry             // G
    ixfr     Ry, Fx                 // G
    fld.d    aBi(Rtab), aB          // B
    shl      16, Blu1, Rx           // B
    fmov.ss  Fx, Fxh                // G
    shr      16, Rx, Ry             // B
    fiadd.dd Fx, aG, aG             // G
    or       Rx, Ry, Ry             // B
    ixfr     Ry, Fx                 // B
    fmov.ss  Fx, Fxh                // B
    fiadd.dd Fx, aB, aB             // B
.endm
// RENDERING PROCEDURE
// 16-bit pixels, 16-bit Z-buffer
    and      3, X1, Ra              // Determine alignment of starting-point
    acc_init Ra, Rb, Rc, Rd, Fa, Fah // Initialize accumulators
    subs     4, Ra, Rb              // 4 - alignment
    subs     dX, Rb, dX             // Adjust dX by X1 alignment
// If dX <= 0, then right end is in same set as left end
    and      3, dX, Rb              // Determine alignment of right end
    zmask    Ra, Rb, Rc, Rd         // Prepare both left- and right-end masks
left_end::                          // Handle boundary conditions
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
    adds     -8, FBP, FBP           // Anticipate autoincrement
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
    adds     -8, ZBP, ZBP           // Anticipate autoincrement
    d.form   lZmask, newz           // Mask 4 new Z values
    fld.d    8(ZBP), oldz           // Fetch 4 old Z values
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
    mov      -4, Ra                 // Loop increment: 4 pixels
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities
    adds     -4, dX, dX             // Prepare dX for bla at end of loop
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities
    bla      Ra, dX, L1             // Initialize LCC
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg
    adds     5, dX, r0              // Are there any whole sets (dX < -5)?
L1::
    d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
    bc       short_segment          // Get out now if no whole set
    d.fnop                          //
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values
inner_loop::                        // Handle all interior points
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
    nop                             //
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
    d.form   f0, newz               // Move 4 new Z values to 64-bit reg
    nop                             //
    d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0]
    mov      -5, Rb                 // -5 mod 4 = 3, aligned right end
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities
    xor      Rb, dX, r0             // Are we at an aligned right end?
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities
    bc       aligned_end            // Taken if at an aligned right end
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg
    bla      Ra, dX, inner_loop     // Loop if not at end of line segment
    d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values for next loop
// End of inner_loop. Right end not aligned

Example 7. 3-D Rendering (1 of 2)
right_end::                         // Handle boundary conditions
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
    nop                             //
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
    d.form   rZmask, newz           // Mask 4 new Z values
    nop                             //
    d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0]
    nop                             //
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities
    nop                             //
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities
    nop                             //
aligned_end::                       // No special boundary conditions
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg
    br       wrap_up                //
    d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
    nop                             //
short_segment::
    d.fnop                          //
    adds     8, dX, r0              // Is right end in same set as left?
    d.fnop                          //
    bnc.t    right_end              // Branch taken if no.
    d.fnop                          //
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values
wrap_up::                           // Store the unstored and leave dual mode.
    fzchks   f0, f0, f0             // Shift PM[7..4] to PM[3..0]
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
    fnop
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]

Example 7. 3-D Rendering (2 of 2)
6.0 ALTERNATIVE IMPLEMENTATIONS

Example 8 contrasts the inner loop of the 16-bit pixel rendering procedure with that of an 8-bit procedure. For 8-bit
pixels, two faddp instructions accomplish 64 bits of pixel intensity interpolation; there is no need to maintain three
separate color accumulators. Four faddz instructions (rather than two) are required, because eight Z values are
created for the eight pixels per loop.


// 8-Bit Pixels, 16-Bit Zbuffer = 8 Pixels in 15 Clocks
// G-Unit                              Core Unit
inner_loop::
    d.faddz  aZ,deltaZ1,aZ             fld.q 16(ZBP),oldZ_A
    d.faddz  aZ,deltaZ2,aZ             nop
    d.form   f0,newZ_A                 nop
    d.faddz  aZ,deltaZ1,aZ             andh  0x8000,dX,r0
    d.faddz  aZ,deltaZ2,aZ             bnc   rightend
    d.form   f0,newZ_B                 nop
    d.fzchks oldZ_A,newZ_A,newZ_A      nop
    d.fzchks oldZ_B,newZ_B,newZ_B      nop
    d.faddp  intens,dI1,intens         fst.q newZ_A,16(ZBP)++
    d.faddp  intens,dI2,intens         bte   0,dX,end
    d.form   f0,newi                   bla   neg8,dX,inner_loop
    d.fnop                             pst.d newi,8(FBP)++

// 16-Bit Pixels, 16-Bit Zbuffer = 4 Pixels in 10 Clocks
// G-Unit                              Core Unit
inner_loop::
    d.faddz  aZ,iZ3,aZ                 nop
    d.faddz  aZ,iZ1,aZ                 fst.d newz,8(ZBP)++
    d.form   f0,newz                   nop
    d.fzchks f0,f0,f0                  mov   -5,Rb
    d.faddp  aB,iB,aB                  pst.d newi,8(FBP)++
    d.faddp  aG,iG,aG                  xor   Rb,dX,r0
    d.faddp  aR,iR,aR                  bc    aligned_end
    d.form   f0,newi                   bla   neg4,dX,inner_loop
    d.fzchks oldz,newz,newz            fld.d 16(ZBP),oldz

Example 8. Inner Loop of Renderers for Two Pixel Sizes
ABSTRACT

The i860 Processor computes floating-point results rap- 
idly, lending itself to DSP (digital signal processing) as 
well as general-purpose computing. With this high per- 
formance, DSP functions can be added to any system 
containing an i860 CPU. A Fast Fourier Transform 
(FFT) illustrates this DSP power. Complete code for 
the FFT is presented in this application note, as well as 
performance measurements. Both complex and real in- 
put data FFTs are included, as well as both Decimation 
in Time and Decimation in Frequency. 


1.0 INTRODUCTION TO FAST 
FOURIER TRANSFORMS 

Discrete Fourier Transforms (DFTs) change time-do- 
main data samples into a frequency-domain profile of 
the sampled signal. The frequency-domain representa- 
tion consists of the magnitudes of sine waves at various 
frequencies, which would recreate the original data if 
superimposed. To accomplish the transform, a DFT 
adds combinations of the input data samples, after mul- 
tiplying some of those inputs with weighting factors. 
The number of samples, “N”, is usually a power of two. 

Each result in the frequency domain comes from a 
weighted sum of all data samples. The weighting (“W”) 
factors are called “twiddles”, and are complex cosine/ 
sine values for each particular frequency. 

The FFT (Fast Fourier Transform) is an efficient im- 
plementation of the DFT, defined by: 

x(n) = time-domain samples of the signal, n = 0, 1, ..., N-1

X(k) = the Discrete Fourier Transform of x(n), k = 0, 1, ..., N-1
     = a “frequency domain” equivalent of x(n)

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W^{nk}, \qquad W^{nk} = e^{-j 2\pi nk/N}, \quad j = \sqrt{-1}$$

$$X(k) = \sum_{n=0}^{N-1} x(n)\left(\cos(2\pi nk/N) - j\,\sin(2\pi nk/N)\right)$$

The (N-1) complex adds and (N-1) complex multiplications
required for each X(k) make the DFT an Order
(N^2) computation. Fortunately, the FFT decomposes
this to an Order (N * log2(N)) algorithm by splitting the
N-sum into units of 2-sums. These units are called
“butterflies” because they produce 2 output values
from 2 inputs, with the butterfly-shaped dataflow
shown below. (Some FFT algorithms, called Radix-4,
use 4-input, 4-output butterflies.) The butterfly calculations
are executed in stages, with log2(N) stages and N/2
butterflies per stage.
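
For reference, the defining sum can be evaluated directly. The Python sketch below is not part of the application note's code; it implements the Order (N^2) definition as written, and the helper `butterfly_count` (a hypothetical name) gives the number of radix-2 butterflies, N/2 per stage times log2(N) stages.

```python
import cmath


def dft(x):
    """Direct evaluation of X(k) = sum over n of x(n) * exp(-j*2*pi*n*k/N)."""
    n_points = len(x)
    result = []
    for k in range(n_points):
        total = 0j
        for n in range(n_points):
            total += x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
        result.append(total)
    return result


def butterfly_count(n_points):
    """Radix-2 FFT: N/2 butterflies in each of log2(N) stages."""
    stages = n_points.bit_length() - 1
    return (n_points // 2) * stages
```

A unit impulse transforms to a flat spectrum and a constant input to a single nonzero term at k = 0, which makes a quick sanity check for any faster implementation.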


The subdivision, or decimation, of the N-sum into but- 
terflies can be done via two different methods: “Deci- 
mation in Time” (DIT) or “Decimation in Frequency” 
(DIF). The methods differ in the ordering of twiddles 
and the form of the butterfly arithmetic, but they yield 
the same answer. They are based on different mathe- 
matical derivations of the FFT: DIT results from recur- 
sively splitting the input time-domain samples into an 
even-indexed group and an odd-indexed, while DIF 
comes from splitting the DFT output frequency-do- 
main points into odd/even groups. 

2.0 BUTTERFLY DEFINED 

Let A = the first input to the butterfly (complex 
number, composed of Real part AR and 
Imaginary part AI) 

B = the second input to the butterfly (com- 
plex, BR and BI) 

W = twiddle factor (also complex, WR and 
WI) 

Anew = complex result #1, which overwrites A 
Bnew = result #2, which overwrites B 

For a “Decimation-in-Frequency” butterfly:

Anew = A + B
Bnew = (A - B) * W

The complex add, subtract, and multiply of a butterfly
decompose into 4 real multiplies, 3 real adds, and 3 real
subtracts:

AnewR = AR + BR      tempR = AR - BR
AnewI = AI + BI      tempI = AI - BI

BnewR = (tempR * WR) - (tempI * WI)
BnewI = (tempR * WI) + (tempI * WR)

For a “Decimation-in-Time” butterfly:

Anew = A + (B * W)
Bnew = A - (B * W)

The number of real operations remains 4 multiplies and
6 add/subtracts, but the equations differ and the multiplies
must be done first:

tempR = (WR * BR) - (WI * BI)
tempI = (WR * BI) + (WI * BR)

AnewR = AR + tempR      BnewR = AR - tempR
AnewI = AI + tempI      BnewI = AI - tempI
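
The two sets of real equations can be checked against ordinary complex arithmetic with a short Python sketch. The function names are hypothetical; the bodies follow the equations above term by term.

```python
def dif_butterfly(ar, ai, br, bi, wr, wi):
    """Decimation-in-Frequency: Anew = A + B, Bnew = (A - B) * W."""
    anew_r = ar + br
    anew_i = ai + bi
    temp_r = ar - br
    temp_i = ai - bi
    bnew_r = (temp_r * wr) - (temp_i * wi)
    bnew_i = (temp_r * wi) + (temp_i * wr)
    return anew_r, anew_i, bnew_r, bnew_i


def dit_butterfly(ar, ai, br, bi, wr, wi):
    """Decimation-in-Time: Anew = A + (B * W), Bnew = A - (B * W)."""
    temp_r = (wr * br) - (wi * bi)
    temp_i = (wr * bi) + (wi * br)
    return ar + temp_r, ai + temp_i, ar - temp_r, ai - temp_i
```

Each routine uses 4 real multiplies and 6 real adds or subtracts, matching the operation counts stated in the text.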
[Figure: Butterfly Dataflow]

The stages, twiddles, and butterflies for 8-point FFTs
are shown in Figures 1 and 2. For larger values of N,
the dataflow patterns are very similar, with N/2 butterflies
executed at each stage, and a greater number of
stages. Refer to a text on Digital Signal Processing for a
complete discussion of FFT design, such as chapter 6 of
Theory and Application of Digital Signal Processing (see
the Bibliography at the end of this note).


Figure 1. Decimation-In-Frequency FFT for 8 points 




Figure 2. Decimation-In-Time FFT for 8 points 
3.0 BIT REVERSAL

Due to their structure, FFT algorithms have the side
effect of scrambling the ordering of output data. For
radix-2 FFTs, the output is in “bit-reversed” order;
for example, the value for frequency one is NOT at
location one in the output array, but at location N/2.
Time to unscramble the output is often NOT included
in FFT benchmarking, because scrambled output is fine
for some signal-processing uses such as convolution. In
any event, unscrambling consists of swapping the locations
of pairs of output values. Alternatively, input values
can be shuffled, as Decimation in Time usually does
before the first stage (as shown in Figure 2). Otherwise,
to avoid the shuffling of input in DIT, the twiddles
must be accessed in bit-reversed order. As an example
of bit-reversal, for 256 points the reordering involves:

SWAP X(i) and X(j), where i = ’klmnopqr’b and j =
’rqponmlk’b. The second index (j) contains the same
bits as (i), but in opposite order.
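
A minimal Python sketch of the reordering follows. It is an illustration rather than the note's assembly routine, and the function names are hypothetical.

```python
def reverse_bits(index, bits):
    """Return index with its low `bits` bits in opposite order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def unscramble(values):
    """Swap each pair X(i), X(j) where j is the bit reversal of i."""
    n = len(values)
    bits = n.bit_length() - 1       # log2(N) for a power-of-two length
    out = list(values)
    for i in range(n):
        j = reverse_bits(i, bits)
        if i < j:                   # swap each pair only once
            out[i], out[j] = out[j], out[i]
    return out
```

For 256 points, index 1 maps to 128 (that is, N/2), which is why the value for frequency one lands at location N/2 in scrambled output.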


4.0 FFT IMPLEMENTATION ON THE 
i860 CPU 

Several features of the i860 CPU contribute to FFT 
performance. The floating-point multiplier and adder 
can simultaneously produce 1 product and 1 sum per 
cycle, using Dual-Operation FP instructions. To fetch 
the butterfly inputs and store outputs, Dual-Instruc- 
tion-Mode allows a memory fetch or store simultaneous 
with the multiply and add. Four floating-point numbers 
can be stored by one instruction, using the 16-byte-op- 
erand “fst.q” instruction. Likewise, 16 bytes can be 
fetched from the data cache in one fld.q op. 

The floating-point arithmetic of the i860 CPU con- 
forms to IEEE 754 format, which some DSPs fail to do. 
Shown below is code for the crucial inner loop of the 
FFT: 


// inner_loop: do 2 Decimation-In-Frequency FFT butterflies.
// Twelve clocks for 2 butterflies - 12 FP add/sub, 8 multiplies.
// 6 8-byte loads, 4 8-byte stores.
// FP-op                                Core-op
inner_loop::
    d.r2pt.ss   WR,DI,BnewR             pfld.d wind(wstart),WRo
    d.pfsub.ss  AR,BR,AnewRo            fld.d  8(fetch)++,ARo
    d.rat1s2.ss AI,BI,AnewIo            fld.d  offset(fetch),BRo
    d.i2st.ss   WI,DR,BnewI             fst.q  AnewR,16(store)++
    d.rat1p2.ss AR,BR,DR                adds   wincr,wind,wind
    d.ia1p2.ss  AI,BI,DI                pfld.d wind(wstart),WR
//
    d.r2pt.ss   WRo,DI,BnewRo           adds   wincr,wind,wind
    d.pfsub.ss  ARo,BRo,AnewR           fld.d  8(fetch)++,AR
    d.rat1s2.ss AIo,BIo,AnewI           fld.d  offset(fetch),BR
    d.i2st.ss   WIo,DR,BnewIo           fst.q  BnewR,offset(store)
    d.rat1p2.ss ARo,BRo,DR              bla    decrem,count,inner_loop
    d.ia1p2.ss  AIo,BIo,DI              and    wlimit,wind,wind  //modulo.
//
5.0 CODE DESIGN

Refer to the inner loop above and code listings at the
end of this application note for the discussions that follow.
Refer to the i860 64-bit Microprocessor Programmer's
Reference Manual (Intel order number
240329) for details on instructions and formats.

The programs include both assembly and Fortran com- 
ponents. Input data can number any power of 2 from 
16 to 1024 points. The algorithms are radix-2, floating- 
point, in-place. Included in the listing are both Decima- 
tion-in-Time and Frequency, and both complex-input 
and real-input FFTs. 


5.1 Cache Utilization 

Because the instruction cache contains 4-Kbytes, all re- 
quired code easily fits in cache. However, a 1024-point 
complex FFT fills the 8-Kbyte data cache with the in- 
put X() array. Thus the more rarely-used twiddle W() 
array is intentionally kept out of cache, as described in 
the “pfld” section. 

A subroutine (“fetch.ss”) is used to move the input data 
array efficiently into cache for the 1024-point FFT. 
“Fetch” allows all data to be brought into cache using 
the next-near (NENE#) accesses to DRAM. Without 
that routine, getting A and B from locations separated 
by 4 Kbytes (NOT the same DRAM page) makes 
fetches and writebacks from DRAM for the first stage 
slower, and adds 30% to overall execution time. 

For larger FFTs (2048 points = 16 kB), straightfor- 
ward expansion of the present algorithm would cause 
increased cache misses. Thus a larger FFT should be 
broken into multiple FFTs of 1024 points so that all 10 
stages of each can achieve high cache hits. The algo- 
rithm becomes (assuming 2048 points, Decimation-In- 
Time): 

1) Bit-reverse the entire input array 

2) Do a 10-stage FFT on the second set of 1024 points. 
Cache hits should be high on those, since they were 
most recently accessed by the bit-reversal. 

3) Do a 10-stage FFT on the first 1024 points. Prefetch 
before the first stage to ensure cache hits. 

4) Combine the 2 separate 1024-point results with a fi- 
nal stage of butterflies, where A is offset from B by 
8 Kbytes. 
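
The four steps above can be sketched in Python for any power-of-two length (the sketch is small enough to try on 16 points rather than 2048). It illustrates the arithmetic only, not the cache behavior; the function names are hypothetical, and twiddles are computed on the fly for brevity rather than fetched from a table.

```python
import cmath


def bit_reverse(values):
    """Return a copy of values permuted into bit-reversed index order."""
    n = len(values)
    bits = n.bit_length() - 1
    out = list(values)
    for i in range(n):
        j = int(format(i, "0{}b".format(bits))[::-1], 2)
        if i < j:
            out[i], out[j] = out[j], out[i]
    return out


def dit_stages(x):
    """Radix-2 DIT butterfly stages on data already in bit-reversed order."""
    n = len(x)
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                a = x[start + k]
                b = x[start + k + half] * w
                x[start + k] = a + b
                x[start + k + half] = a - b
        half *= 2
    return x


def split_fft(samples):
    """Bit-reverse everything, transform each half, then combine."""
    n = len(samples) // 2
    y = bit_reverse(samples)            # step 1
    second = dit_stages(y[n:])          # step 2: second half first
    first = dit_stages(y[:n])           # step 3: then the first half
    out = [0j] * (2 * n)
    for k in range(n):                  # step 4: final stage of butterflies
        t = cmath.exp(-2j * cmath.pi * k / (2 * n)) * second[k]
        out[k] = first[k] + t
        out[k + n] = first[k] - t
    return out
```

After the full-length bit reversal, the first half holds the even-indexed samples and the second half the odd-indexed samples, each already in bit-reversed order, so each half can be transformed independently before the combining stage.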


5.2 Pfld 

Twiddle factors (W) are fetched with pfld (Pipelined 
Floating-Point Load), to avoid caching them. Only in 
the first stage are all the W() elements used; successive 
stages use fewer and fewer elements, which are separat- 
ed by larger and larger strides. Thus placing W() in 
cache would be inefficient. The streaming of W() from 
main memory actually yields better performance than 
caching W(), for 512 and 1024 points. With the i860 
CPU’s 8-byte external data bus, a complex W() value 
can be transferred in a single bus cycle. Some FFT rou- 
tines calculate W() on the fly, rather than fetching pre- 
calculated values; however, performance decreases due 
to the added run-time calculations. 
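
The shrinking, increasingly strided use of W() can be made concrete with a short Python sketch. It assumes the usual radix-2 Decimation-in-Frequency indexing over a table of N/2 twiddles, with stages numbered from 0; the function name is hypothetical.

```python
def twiddle_indices(n_points, stage):
    """Indices into the N/2-entry twiddle table used by a DIF stage.

    Assumed indexing: stage 0 uses every entry; each later stage uses
    half as many entries at twice the stride.
    """
    stride = 1 << stage
    count = n_points >> (stage + 1)
    return [k * stride for k in range(count)]
```

For 8 points the stages use indices [0, 1, 2, 3], then [0, 2], then [0], which shows why most of a cached W() array would sit unused in later stages.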


5.3 Fst.q 

Quad-word (16-byte) stores allow 4 floating-point regis- 
ter values to update the cache in one cycle. Likewise, 
fld.q (Quad Floating Point Load) transfers 4 values to 
the registers in a cycle. However, in some FFT stages, 
double-word fetches (fld.d) are used instead of fld.q; 
that allows the “background” fetch of a set of operands 
concurrent with arithmetic on the other set. For the 
same reason, the inner loop does 2 butterflies, rather 
than one. 


5.4 Bit Reversal Code 

The code for bit-reversal fetches the indices of 2 elements
to be swapped from a pre-allocated array of indices,
and swaps the data elements. Again, pfld.d keeps
the indices out of cache, for the 1024-point case. That
assembly version of bit-reversal is approximately 7
times faster than the standard Fortran routine. The array
of indices was generated by printing out the values
generated during operation of the standard Fortran version;
similarly, the twiddle W() values can be pre-allocated
and generated using a high-level-language program.

6.0 PIPELINE SCHEDULING 

The adder pipeline is 3 stages, as is the multiplier; for 
the calculation of 

BnewR = (AR - BR) * WR - (AI - BI) * WI

the adder result is fed back into the multiplier, and the 
product again feeds into the adder. The adder and mul- 
tiplier pipes each advance one stage for each floating- 
point instruction issued. 
The butterfly decomposes into 6 real add/subtracts and
4 real multiplies. Thus the best possible performance
would be 6 clocks per butterfly, with the multiplies totally
overlapping the adds. The overlap is accomplished
with the Dual-Operation instructions:

r2pt   (KR*src2, Treg + Mout, load KR <- src1)
rat1s2 (KR*Aout, src1 - src2, load T <- Mout)
i2st   (KI*src2, Treg - Mout, load KI <- src1)
rat1p2 (KR*Aout, src1 + src2, load T <- Mout)
ia1p2  (KI*Aout, src1 + src2, load KI <- src1)

KR, KI, and T are operand registers feeding the multiplier
and adder, separate from the floating-point register
file. They permit the 4 inputs for multiply and add,
even though the instruction format holds only 2 registers.
“Aout” and “Mout” are adder and multiplier outputs.

The data path arrangements of some of these ops are 
illustrated in Figures 3 and 4. Fetching and storing of 
butterfly operands is overlapped with the calculations, 
using Dual Instruction Mode — the integer core op 
(such as a load or branch) and FP op are fetched simul- 
taneously from the instruction cache and executed 
simultaneously. 

Scheduling of instructions was done with a pipeline diagram,
as illustrated in the comments of the code listing
of difstep.ss in the Appendix. (The comments show the
machine state after the instruction is processed.) Begin
by placing the desired results in the rightmost column,
then trace progress backwards through the adder.
When adder inputs are products (of the multiplier), one
product is kept in the Treg for a cycle while the other
propagates through the multiplier final stage. Those
products can be traced back on the multiplier pipeline,
to determine at what instruction the multiplier inputs
must be provided.

Figure 3. Datapath for r2pt op

For example, place the BnewR label in the “Write”
stage of the pipe (the output of the Adder). Now

BnewR = WR * DR - WI * DI

Three instructions earlier, the adder inputs for BnewR
must be fed to the adder; those inputs are products, one of
which comes directly from the multiplier output, and
the other from the Treg. The multiplier output and
Treg value must then be traced back through multiplier
stages, requiring the following instructions:

i2st.ss   WIo,DR,BnewIo   as the 10th op of 12, to start (T - Mout)
rat1s2.ss AIo,BIo,AnewI   as the 9th instruction, to update the Treg
ia1p2.ss  AI,BI,DI        as the 6th op, to multiply DI * WI
rat1p2.ss AR,BR,DR        as the 5th op, to multiply DR * WR
rat1s2.ss AI,BI,AnewIo    as the 3rd, to start DI into the adder
pfsub.ss  AR,BR,AnewRo    as the 2nd, to start DR into the adder



Figure 4. Datapath for rat1p2 op 
Some trial-and-error ordering of the desired outputs is
needed to devise a sequence which keeps the adder
pipeline full. An op is chosen for each slot for its ability
to load the KR or KI register, or to initiate an adder
operation simultaneous with the multiplies required to
calculate BnewR and BnewI.

Handy hints to assist dual-operation scheduling include:

1) Feed back the adder result to the multiplier, or vice
versa, whenever possible. For example, the rat1p2
op feeds adder-out to multiplier. Thus both src1 and
src2 fields of the instruction are available to feed the
adder-in, and a simultaneous useful add and multiply
are initiated.

2) Freeze one of the pipes, by using a pfadd or pfmul,
when appropriate. In the butterfly, where 6 adds are
done for every 4 multiplies, freezing of the multiplier
does not degrade performance. The freeze allows
multiplier results to be held until needed in the adder.

3) The Treg can hold a multiplier result for several
cycles until needed in the adder.

4) Unroll a loop to do 2 iterations per loop. That provides
time to fetch inputs for iteration 2 while calculating
iteration 1, and store results of iteration 1
(and fetch more inputs) while calculating iteration 2.


7.0 PERFORMANCE MEASUREMENTS 

The code was run on an evaluation card with DRAM 
memory only, no external cache, 33.33 MHz clock, and 
5 wait-states or more for some accesses. Next-near ac- 
cesses (address falls into the same DRAM page as the 
previous access) are zero wait-state, but far accesses 
take 5 or more wait-states. The code was run under a 
virtual-memory multitasking executive. Shown below 
are measured results: 

System: 33.3 MHz 80860 with a single bank of 
static-column DRAM 

Algorithm: Radix-2 FFT, in-place. Data is IEEE 754 
single-precision floating point. Implemented in assem- 
bly-language and Fortran code. 


Type of FFT                   Time       Time (including bit-reversal)
1024-point complex, DIF       1.17 ms    1.33 ms
1024-point real                          0.67 ms
512-point complex, DIF        0.48 ms    0.56 ms
512-point real                           0.33 ms
256-point complex, DIF        0.22 ms    0.26 ms
1024-point complex, DIT                  1.37 ms
512-point complex, DIT                   0.59 ms


7.1 Cache Fill and Writeback Time 

Measured times do not include cache fill and writeback.
That is, the timings measured 200,000 executions
of the FFT using the same input array. (Performance
figures offered by other manufacturers for DSP chips
likewise assume that the data is already in on-chip
RAM. Of course, the i860 CPU will do that fetching
automatically into its data cache.) The additional time
for cache fill and writeback was measured as:

1024-point complex    0.25 ms (8 Kbytes fetched, 8 Kbytes writeback)
512-point complex     0.12 ms (4 Kbytes)

To quantify the calculations in MFlops (Millions of 
FLoating-point Operations per Second), consider that 
the 1024-point complex FFT is implemented with 
about 16,400 multiplies and 28,700 adds/subtracts. 
Thus the 1.17 ms translates to a sustained 38.5 MFlops 
rate. For 512 points, the required 20,000 Flops means 
41.6 MFlops. 
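
The MFlops figures follow from dividing the operation count by the measured time. A small Python check, using only the numbers quoted above (the function name is hypothetical):

```python
def mflops(flop_count, seconds):
    """Millions of floating-point operations per second."""
    return flop_count / seconds / 1e6


# 1024-point complex FFT: about 16,400 multiplies plus 28,700 adds/subtracts
rate_1024 = mflops(16400 + 28700, 1.17e-3)

# 512-point complex FFT: about 20,000 floating-point operations
rate_512 = mflops(20000, 0.48e-3)
```

The results agree with the quoted 38.5 and 41.6 MFlops to within rounding.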

The overall FFT is about 10 times faster than the equiv- 
alent Fortran. Inner loop performance was measured at 
13 cycles for the 24 instructions, which is 6.5 cycles per 
butterfly. 
8.0 CODE HIERARCHY

Pictured below are the programs developed for the i860 CPU FFT:

[Figure 240658-6: program hierarchy showing fft.f, dirr.f, bitrev.ss (containing fetch), and difstep.ss]

The Fortran program ffttest.f is the highest-level program
of those listed on the following pages. It calls two
FFT subroutines, fft.f and diff.f, and then compares their
outputs. Fft.f is a Fortran decimation-in-time algorithm,
while diff.f is the high-speed DIF routine. Diff.f
is callable by C or Fortran applications. It in turn calls
difstep, which is implemented in assembly code
(difstep.ss). Difstep is called once per stage of the FFT.
A Fortran version (difstepf.f) is shown, for comparison.
Other assembly routines are the bit-reversal data movement
(bitrev.ss) and prefetch (“fetch” inside bitrev.ss).

Difstep.ss contains approximately 225 assembly in- 
structions, and bitrev.ss contains about 24. The Fortran 
diff.f compiles to about 80 instructions. 


9.0 CONCLUSION 

The i860 CPU computes very Fast Fourier Transforms,
quicker than most high-end dedicated DSP chips.
Contributing to the FFT performance are the 8-Kbyte
on-chip data cache and 4-Kbyte instruction cache. In
addition, the 8-byte external data bus, the pfld instruction,
and the 16-byte data cache width provide sufficient
bandwidth to keep the arithmetic units busy.
Dual-Operation instructions and Dual-Instruction-Mode allow
parallel data movement and calculations. The 33.3 MHz
clock rate allows both an add and a multiply every 30 ns,
giving a time of 1.17 ms for a 1024-point complex FFT.
A 40 MHz i860 Microprocessor will yield a time of less
than 1 ms.
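The clock figures above can be sanity-checked as follows. This is a sketch under one stated assumption: the cycle count of the FFT stays the same at the higher clock, so time scales inversely with frequency (memory-system effects are ignored). The function names are illustrative.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock period in nanoseconds."""
    return 1e3 / clock_mhz

def scaled_time_ms(time_ms, old_mhz, new_mhz):
    """Execution time at a new clock rate, assuming an unchanged cycle count."""
    return time_ms * old_mhz / new_mhz

cycle_33 = cycle_time_ns(33.3)               # period at 33.3 MHz
time_40 = scaled_time_ms(1.17, 33.3, 40.0)   # 1024-point FFT at 40 MHz

print(f"33.3 MHz period: {cycle_33:.1f} ns")
print(f"1024-point FFT at 40 MHz: {time_40:.3f} ms")
```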



A Decimation-in-Time version of diff.f and difstep.ss
can be found in ditt.f and ditstep.ss. The DIT version
performs 5-10% slower than the Decimation-in-Frequency
version because the DIT loop takes 7 cycles per
butterfly, while DIF takes 6.

A real-input algorithm is dirr.f, which can be called 
and tested using program real.f. Dirr.f calls difstep to 
do a complex DIF FFT on N real data points, but 
treats them as N/2 complex points. Then realfix.ss is 
called by dirr.f to fix the DIF output, compensating for 
the treatment of the N real points as N/2 complex. The 
derivation of the real-fix can be found in reference 3, 
Numerical Recipes in C. 
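The idea of treating N real points as N/2 complex points and then fixing up the result can be sketched in Python. This is the textbook even/odd split, not a transcription of realfix.ss; the function names are hypothetical, and a direct DFT stands in for the complex FFT so that the example is self-contained.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT, used here as a stand-in for the complex FFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def real_fft_via_half_size(x):
    """Return X[0..N/2] for real input x of even length N."""
    n = len(x)
    half = n // 2
    # Pack even samples into the real part, odd samples into the imaginary part.
    z = [complex(x[2 * k], x[2 * k + 1]) for k in range(half)]
    zf = dft(z)
    out = []
    for k in range(half):
        zc = zf[(half - k) % half].conjugate()
        even = (zf[k] + zc) / 2        # DFT of the even-indexed samples
        odd = (zf[k] - zc) / 2j        # DFT of the odd-indexed samples
        out.append(even + cmath.exp(-2j * cmath.pi * k / n) * odd)
    # Bin N/2 uses the twiddle factor exp(-i*pi) = -1.
    out.append(complex(zf[0].real - zf[0].imag, 0.0))
    return out

samples = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
fixed = real_fft_via_half_size(samples)
reference = dft(samples)
max_err = max(abs(fixed[k] - reference[k]) for k in range(len(samples) // 2 + 1))
print(f"max error vs direct DFT: {max_err:.2e}")
```

The remaining bins N/2+1 to N-1 follow from the conjugate symmetry of a real-input transform, so only N/2+1 outputs need to be computed.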

The mixture of Fortran, C, and assembly code is
accomplished by passing function inputs and outputs in
registers. Only pointers and integer values were used in
the above code, but floating-point parameters can also
be exchanged. A calling program feeds arguments to a
function in r16, r17, and higher-numbered integer
registers. The callee is permitted to destroy the contents of
those registers, but r1:r15 must be preserved. For more
details on parameter-passing conventions see the i860
64-bit Microprocessor Programmer's Reference Manual,
Chapter 8.

3. Press, William; Flannery, Brian; et al. Numerical
Recipes in C. Cambridge University Press, 1988.
Pages 398-424.

[Numerical Recipes contains the C-code source for
"realfix"]

APPENDIX A
PROGRAM LISTINGS

 1) diff.f:
    Fortran module to do fast Decimation-In-Frequency (DIF) Radix-2 FFT.

 2) difstep.ss:
    Assembly code which does all DIF FFT butterflies; called by diff.f.

 3) difstepf.f:
    Fortran equivalent of difstep.ss. Included here for clarity.

 4) bitrev.ss:
    Assembly code to do bit-reversal.

 5) ffttest.f:
    Highest-level Fortran code. Tests diff.f or ditt.f.

 6) ditt.f:
    Fortran module to do fast Decimation-In-Time (DIT) Radix-2 FFT.

 7) ditstep.ss:
    Assembly code which does all DIT FFT butterflies; called by ditt.f.

 8) dirr.f:
    Fortran module for Real-Input Decimation-In-Frequency (DIF) Radix-2 FFT.

 9) realfix.ss:
    Assembly code required by dirr.f to compensate for Real-Input.

10) real.f:
    Highest-level Fortran code, for Real-value input. Tests dirr.f.

11) fft.f:
    Fortran FFT algorithm. Generates "correct" answers for comparison against the other code.

12) makefile:
    Unix V/386 version of a makefile to maintain the FFT code, using the Unix "make"
    program-maintenance utility. Note that this makefile uses the Unix macro preprocessor
    "m4" to convert symbolic names to register numbers.

13) start.ss:
    Assembly code preamble for Fortran runtime.

14) time.c:
    Dummy routine, used to install breakpoints.


AP-435 





C
C     File: diff.f
C     FFT - Decimation in Freq, radix-2, inplace, 1-dimen
C     Intel assumes no responsibility for use or misuse of this code.
C     5/19/89: call fetch8() added for 1024-point caching.
C     6/01/89: fetch() CRUCIAL - 30% performance loss if removed
C     Inputs:
C       A   = complex array of input, up to 1024 pts, single-prec float
C       M   = log of number of pts
C           = (number of stages of FFT)
C       N   = number of points, ie, N = 2**M = number of pts
C       W   = complex array of twiddle factors, length N/2.
C       REV = 0 if bitreversed output ok. 1 = must re-order output
C
C     Outputs:
C       A   = complex fft of input A
C
      subroutine diff (a,m,N,W,REV)
      integer m,N,i,j,k,REV,wlimit
      integer offset,stage,groups,wincr,powers2(0:10)
      complex a(n),w(N/2),temp
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C     Powers2 to avoid calls to POW, DIV
C     Twiddle factor array w(k) has (cos, -sin) of 2pi*k/N
CC    Assume the caller provides w(k) constants ALREADY initialized
C
C     Pre-touch data, lock into cache, for 8kByte fft:
      IF (N .gt. 513) THEN
         call fetch(a,%VAL(n))
      ENDIF
C
      wlimit = 8*((N/2) - 1)
C     "DO 20" stage-loop
      DO 20 stage = 1,m
         groups = powers2(stage-1)
C     groups=number of times the twiddle factors are used, ie, the number of
C     smaller DFTs the stage is split into.
C     offset gets N/2,N/4,N/8,N/16,...
         offset = powers2(m-stage)
         wincr = groups
         call difstep(a,w,groups,offset,wincr,wlimit)
   20 CONTINUE
      IF (REV .ne. 0) THEN
CC    REV .ne. 0 means must do bit-reversal reordering of output
         call bitrev(a,%VAL(M),n)
      ENDIF
      RETURN
      END
C
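The stage loop of diff.f, together with the general-case butterfly loop of the Fortran difstep, can be mirrored in a short Python sketch. This is an illustrative model, not the tuned code: indices are 0-based, the twiddle table is built inside the function rather than passed in, and the helper names `dif_fft_bitrev_out`, `bit_reverse`, and `direct_dft` are assumptions introduced here.

```python
import cmath

def dif_fft_bitrev_out(data):
    """Radix-2 DIF FFT; returns the transform in bit-reversed order."""
    a = list(data)
    n = len(a)
    m = n.bit_length() - 1                     # number of stages, n = 2**m
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    for stage in range(1, m + 1):
        groups = 1 << (stage - 1)              # sub-DFTs in this stage
        offset = 1 << (m - stage)              # distance between butterfly inputs
        wincr = groups                         # stride through the twiddle table
        for g in range(groups):
            base = g * 2 * offset
            j = 0
            for i in range(base, base + offset):
                temp = a[i] - a[i + offset]
                a[i] = a[i] + a[i + offset]
                a[i + offset] = w[j] * temp
                j += wincr
    return a

def bit_reverse(i, m):
    """Reverse the low m bits of i."""
    r = 0
    for _ in range(m):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def direct_dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

ramp = [complex(i, i + 0.5) for i in range(16)]   # ramp input, as in the test program
out = dif_fft_bitrev_out(ramp)
ref = direct_dft(ramp)
max_err = max(abs(out[bit_reverse(k, 4)] - ref[k]) for k in range(16))
print(f"max error after undoing bit reversal: {max_err:.2e}")
```

With natural-order input the DIF result lands in bit-reversed order, which is why the comparison indexes `out` through `bit_reverse`; this corresponds to the REV argument deciding whether bitrev is called.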
//
// difstep.ss: do one stage of fft butterflies
// DIF = Decimation in Frequency, radix-2, inplace, 1-dimension
// (C) Copyright 1989 INTEL Corporation.
// Inner loop developed with assistance from Tricord Systems, Inc.
//
// 5/18/89: 1 pm - offset_2 added, as next-to-last stage was slow
// 5/19/89: 4 pm - fetch8() routine added, for cache miss avoidance.
// 5/31/89: am - use fst.q (13% perf improvement of inner_loop!)
//               last_bfly added, for performance.
// 6/02/89: am - bptr deleted. Modulo-address W (5% perf improved)
//
// Intel is not responsible for use nor for misuse of this program.
//
// Do one entire stage (n/2 butterflies). Sample invocation:
//    call difstep(a,w,groups,offset,wincr,wlimit)
//================================================================
// Inputs:
//   A= complex array of input, single-prec float
//      (complex stored as 4byte real, 4byte imag contiguously)
//   W= pointer to array of twiddle factors. Assuming W(k) is
//      CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1.
//   offset = distance (except for scale-by-8byte sizeof(complex)) between
//      the 2 input values for each butterfly.
//      Offset also is the number of butterflies done per "group".
//   groups = N/(2*offset). The number of sub-DFTs this stage is split into.
//   wincr = distance (except for scale-by-8byte sizeof(complex)) between
//      successive w values for successive butterflies
//   wlimit = max index, in bytes, of W table.
//
// Outputs:
//   A= complex radix-2 butterflied version of input.
//
define(astart,r16)    //input data base address
define(wstart,r17)    //twiddle array ptr. Because w-contents depend on N,
                      // we will assume the caller has initialized w() array.
define(groups,r18)    //groups=number of sub-DFTs this stage is split into.
define(offset,r19)    //offset (initially elements, mult by 8 to get bytes)
                      // between node and its dual (the 2 numbers to butterfly, ie. A and B)
define(wincr,r20)     //increment between successive W values. Remains constant
                      // within a given stage. For Decimation in Freq, wincr addressing is:
                      //  +8 for offset=N/2 (W0,W1,W2,W3,...W(n-1))
                      //  +16 offset=N/4 (W0,W2,W4,...) etc...
define(wlimit,r21)    //max index, in bytes, of W table.
define(wind,r22)      //current index, in bytes, of W table.
define(offset2,r23)   //offset*2
define(decrem,r24)    //bla decrement
define(somecount,r25) // bla counter
define(FEtch,r26)     //pointer to 1st component of butterfly (load)
define(STore,r27)     //   "     "  1st component of butterfly (store)
// f4:f7 spare
define(AR,f12)        //element A, real component
define(AI,f13)        //   "   ", imag
define(ARo,f14)       // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BR,f16)        //element B, real component
define(BI,f17)
define(BRo,f18)       // extra B value, for prefetch
define(BIo,f19)

define(ER,f20)        //A+B, real (ER = AR + BR)
define(EI,f21)        //  "  imag  "
define(ERo,f22)       //A+B, real, previous loop's value
define(EIo,f23)       //  "  imag  "

define(FR,f24)        //W*(A-B), real
define(FI,f25)        //  "  imag  "
define(FRo,f26)
define(FIo,f27)

define(DR,f28)        //Difference of A-B, real part
define(DI,f29)        //  "   ", imag  "
define(WR,f30)        //W (twiddle factor), real part
define(WI,f31)        //  "   ", imag
define(WRo,f10)       //W (twiddle factor), real part (EXTRA copy)
define(WIo,f11)       //  "   ", imag

        .text
        .align  .quad
_difstep_::
        ld.l    0(groups),groups        //fix Fortran call-by-ref
        ld.l    0(offset),offset        //
        shl     3,offset,offset         // change from elements to bytes
        shl     1,offset,offset2
        fst.q   f8,-16(sp)++            //save "local" regs
        fst.q   f12,-16(sp)++           //  "      "
        adds    -1,groups,groups        // pre-decrement for bnc usage, or bla usage
        adds    -16,r0,decrem           //bla decrement

// We code the last 2 stages as special cases:
//
        xor     8,offset,r0     //offset=1, special case, no complex mult, funny addressing
        bc      offset_1        // (ASSUMING offset=1, means wincr=0, and no twiddle used)
        xor     16,offset,r0    //offset=2, special case, no complex mult, funny addressing
        bc      offset_2        // (ASSUMING offset=2 means wincr=N/4)
//
        ld.l    0(wincr),wincr
        ld.l    0(wlimit),wlimit
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0       // init A1,A2,A3=0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
//
// init pointers:
        shl     3,wincr,wincr   //scale for bytes.
        shl     1,wincr,wind    //init wind = 2*wincr
        pfld.d  0(wstart),f0
        pfld.d  wincr(wstart),f0
        adds    -8,astart,FEtch
        pfld.d  wind(wstart),f0
        adds    wincr,wind,wind //wind now 3*wincr
// here fetch first set of A,B,W before bla-loop
        pfld.d  wind(wstart),WR
        adds    wincr,wind,wind
        and     wlimit,wind,wind        //modulo-wlimit the w index
// We do modulo-addressing on W(), to keep the pfld pipeline full. We
// never do a W-fetch beyond the end of the table.
// And the modulo-check needs to be done only every 4th pfld, as always
// we use a multiple of 4 W() factors.

        fld.d   8(FEtch)++,AR
        fld.d   offset(FEtch),BR
        d.r2ap1.ss f0,f0,f0     //clear Treg.
        adds    -32,offset,somecount    // bla counter (predecrement by 4 elements)
//
// Definitions for pipe diagram:
// (the complex multiply product, F, broken into 4 real mult and 2 adds):
//   WR = cos(), WI = -sin().
//   DR = AR - BR ; (difference of Real components of A,B)
//   DI = AI - BI ; (difference of Imag components)
//   ER = AR + BR ; EI = AI + BI ;
//   FR = K - L; where K= WR*DR, L=WI*DI
//   FI = N + M; where M= WI*DR, N=WR*DI
// For 1st time thru inner_loop, don't have correct values to store.
// Must do 1 loop before the loop, sans the stores.
first_bfly::    //fill pipe
//                              KR...KI...M1...M2...M3...T...A1...A2...A3...Write
        d.r2pt.ss   WR,f0,f0    // WR0 -
        pfld.d  wind(wstart),WRo
        d.pfsub.ss  AR,BR,f0    // - - - - DR0
        fld.d   8(FEtch)++,ARo
        d.rat1s2.ss AI,BI,f0    // DI0 DR0
        fld.d   offset(FEtch),BRo
        d.i2st.ss   WI,f0,f0    // WI0 - - - - - DI0 DR0
        adds    wincr,wind,wind
        d.rat1p2.ss AR,BR,DR    // K0 ER0 DI0 DR0
        nop
        d.ia1p2.ss  AI,BI,DI    // L0 K0 _ EI0 ER0 DI0
        pfld.d  wind(wstart),WR
        d.r2pt.ss   WRo,DI,f0   // WR1 N0 L0 K0 _ EI0 ER0
        fld.d   8(FEtch)++,AR
        d.pfsub.ss  ARo,BRo,ER  // N0 L0 K0 . DR1 EI0 ER0
        fld.d   offset(FEtch),BR
        d.rat1s2.ss AIo,BIo,EI  // _ N0 L0 K0 DI1 DR1 . EI0
        adds    wincr,wind,wind
        d.i2st.ss   WIo,DR,f0   // WI1 M0 . N0 K0 K-L DI1 DR1
        and     wlimit,wind,wind

quickstart::
        d.rat1p2.ss ARo,BRo,DR  // K1 M0 N0 ER1 FR0 DI1 DR1
        bla     decrem,somecount,inner_loop     //init LCC
        d.ia1p2.ss  AIo,BIo,DI  // L1 K1 M0 N0 EI1 ER1 FR0 DI1
        adds    -16,astart,STore        // ptrs init 16 low, for fst.q instructions
//
// Each butterfly = 1 complx multiply, 1 complx add, 1 complx subtract
//                = 4 multiply,
//                  3 add
//                  3 subtract
//                  3 8-byte fetches (A, B, W)
//                  2 8-byte stores (A, B)
//
//                = 6 cycles per butterfly
//
// inner_loop: iterates "offset/2" times (eg. N/4 for stage 1, N/8 for stage2),
//   for each group. It does 2 butterflies per iteration

inner_loop::
//                              KR...KI...M1...M2...M3...T...A1...A2...A3...Write
//                              |    |    |    |    |    |   |    |    |    |
        d.r2pt.ss   WR,DI,FR    // WR2 N1 L1 K1 N0 N+M EI1 ER1 FR0
        pfld.d  wind(wstart),WRo
        d.pfsub.ss  AR,BR,ERo   // N1 L1 K1 N0 DR2 FI0 EI1 ER1
        fld.d   8(FEtch)++,ARo
        d.rat1s2.ss AI,BI,EIo   // _ N1 L1 K1 DI2 DR2 FI0 EI1
        fld.d   offset(FEtch),BRo
        d.i2st.ss   WI,DR,FI    // WI2 M1 _ N1 K1 K-L DI2 DR2 FI0
        fst.q   ER,16(STore)++  //update ER/EI/ERo/EIo
        d.rat1p2.ss AR,BR,DR    // K2 M1 - N1 ER2 FR1 DI2 DR2
        adds    wincr,wind,wind
        d.ia1p2.ss  AI,BI,DI    // L2 K2 M1 N1 EI2 ER2 FR1 DI2
//no need for modulo-check ("and") here, as odd num of W's have been fetched.
        pfld.d  wind(wstart),WR
//
//                              KR...KI...M1...M2...M3...T...A1...A2...A3...Write
        d.r2pt.ss   WRo,DI,FRo  // WR3 - N2 L2 K2 N1 N+M EI2 ER2 FR1
        adds    wincr,wind,wind
        d.pfsub.ss  ARo,BRo,ER  // N2 L2 K2 N1 DR3 FI1 EI2 ER2
        fld.d   8(FEtch)++,AR
        d.rat1s2.ss AIo,BIo,EI  // . N2 L2 K2 DI3 DR3 FI1 EI2
        fld.d   offset(FEtch),BR
        d.i2st.ss   WIo,DR,FIo  // WI3 M2 . N2 K2 K-L DI3 DR3 FI1
        fst.q   FR,offset(STore)        //update FR/FI/FRo/FIo
        d.rat1p2.ss ARo,BRo,DR  // K3 M2 - N2 ER3 FR2 DI3 DR3
        bla     decrem,somecount,inner_loop
        d.ia1p2.ss  AIo,BIo,DI  // L3 K3 M2 N2 EI3 ER3 FR2 DI3
        and     wlimit,wind,wind        //modulo.

end_inner_loop::        //KEEP Pipelines full
// RE-init pointers for fetches
        d.fiadd.ss  f0,f0,f0
        adds    offset2,astart,astart   //bump to next group
//redo A,B fetches, with proper ptr.
        d.fiadd.ss  f0,f0,f0
        fld.d   0(astart),AR    //get first AR/AI in next group
        d.fiadd.ss  f0,f0,f0
        fld.d   offset(astart),BR
        d.fiadd.ss  f0,f0,f0
        adds    0,astart,FEtch

last_bfly::     //do final 2 butterflies, start next group
//                              KR...KI...M1...M2...M3...T...A1...A2...A3...Write
        d.r2pt.ss   WR,DI,FR    // WR4 - N3 L3 K3 N2 N+M EI3 ER3 FR2
        pfld.d  wind(wstart),WRo
        d.pfsub.ss  AR,BR,ERo   // N3 L3 K3 N2 DR4 FI2 EI3 ER3
        fld.d   8(FEtch)++,ARo
        d.rat1s2.ss AI,BI,EIo   // . N3 L3 K3 DI4 DR4 FI2 EI3
        fld.d   offset(FEtch),BRo
        d.i2st.ss   WI,DR,FI    // WI4 M3 N3 K3 K-L DI4 DR4 FI2
        fst.q   ER,16(STore)++
        d.rat1p2.ss AR,BR,DR    // K4 M3 _ N3 ER4 FR3 DI4 DR4
        adds    wincr,wind,wind
        d.ia1p2.ss  AI,BI,DI    // L4 K4 M3 N3 EI4 ER4 FR3 DI4
        pfld.d  wind(wstart),WR
//                              KR...KI...M1...M2...M3...T...A1...A2...A3...Write
        d.r2pt.ss   WRo,DI,FRo  // WR5 - N4 L4 K4 N3 N+M EI4 ER4 FR3
        fld.d   8(FEtch)++,AR
        d.pfsub.ss  ARo,BRo,ER  // N4 L4 K4 N3 DR5 FI3 EI4 ER4
        adds    -32,offset,somecount    // reset bla counter
        d.rat1s2.ss AIo,BIo,EI  // - N4 L4 K4 DI5 DR5 FI3 EI4
        adds    wincr,wind,wind
        d.i2st.ss   WIo,DR,FIo  // WI5 M4 - N4 K4 K-L DI5 DR5 FI3
        adds    -1,groups,groups
        d.fnop
        fld.d   offset(FEtch),BR
        d.fnop
        bnc.t   quickstart      //branch on value of groups
        d.fnop
        fst.q   FR,offset(STore)
end_last_bfly::
        d.fnop
        br      endit
        fiadd.ss f0,f0,f0
        fst.q   FR,offset(STore)        //repeated for bnc.t untaken case
        .align  .quad
offset_1::
// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0,
// and that w=(1,0), so that no complex mult needed, and NO W will be fetched.
// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd, 4 dword fld, 4 fst,
// 1 bla) (fld.q required, to reduce # flds to avoid pipe stalls)
// Performance = 4 cyc/bfly best case.

//Redefine regs for fld.q, fst.q usage, when A and B adjacent:
define(AR3,f12)         //element A, real component
define(AI3,f13)         //   "   ", imag
define(BR3,f14)         //element B, real component
define(BI3,f15)
define(AR4,f16)         // extra A value, for prefetch
define(AI4,f17)
define(BR4,f18)         // extra A value, for prefetch
define(BI4,f19)

define(ER3,f20)         //A+B, real (ER = AR + BR)
define(EI3,f21)         //  "  imag  "
define(FR3,f22)         //(A-B), real
define(FI3,f23)         //  "  imag  "

define(ER4,f24)         //A+B, real, extra copy
define(EI4,f25)         //  "  imag

define(FR4,f26)
define(FI4,f27)

        adds    -16,astart,FEtch
        fld.q   16(FEtch)++,AR4
        adds    -1,groups,somecount     // bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally.
        adds    -16,FEtch,STore
//startup the loop:
//                              A1......A2......A3......Write:
        d.pfadd.ss  AR4,BR4,f0  // ARn+BRn - - -
        fld.q   16(FEtch)++,AR3
        d.pfadd.ss  AI4,BI4,f0  // AIn+BIn ERn
        adds    -2,r0,decrem    //2 bflies per loop
        d.pfsub.ss  AR4,BR4,f0  // ARn-BRn EIn ERn
        bla     decrem,somecount,offset1_loop   //init LCC
        d.pfsub.ss  AI4,BI4,ER4 // AIn-BIn FRn EIn ERnext
        nop
//                              A1......A2......A3......Write:
offset1_loop::
        d.pfadd.ss  AR3,BR3,EI4 // AR+BR FI- FR- EI-
        nop
        d.pfadd.ss  AI3,BI3,FR4 // AI+BI ER FI- FR-
        fld.q   16(FEtch)++,AR4
        d.pfsub.ss  AR3,BR3,FI4 // AR-BR EI ER FI-
        fst.q   ER4,16(STore)++
        d.pfsub.ss  AI3,BI3,ER3 // AI-BI FR EI ER
        nop
        d.pfadd.ss  AR4,BR4,EI3 // AR2+BR2 FI FR EI
        fld.q   16(FEtch)++,AR3
        d.pfadd.ss  AI4,BI4,FR3 // AI2+BI2 ER2 FI FR
        nop
        d.pfsub.ss  AR4,BR4,FI3 // AR2-BR2 EI2 ER2 FI
        bla     decrem,somecount,offset1_loop
        d.pfsub.ss  AI4,BI4,ER4 // AI2-BI2 FR2 EI2 ERnext
        fst.q   ER3,16(STore)++
//
end_offset1_loop::
        d.fiadd.ss  f0,f0,f0
        br      endit
        fiadd.ss f0,f0,f0
        nop

//
        .align  .quad
offset_2::
// want FEtch=0,1; 4,5; 8,9; 12,13;... elements.
// ASSUMING wincr=N/4 (W_addr=0,N/4,0,N/4,0,...). Trivial W() factors.
// USE bla loop, incrementing FEtch by 16 (2*offset).
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So FReven=(AR-BR), FIeven=(AI-BI).
// Odd components have W=(0,-1). So FRodd=(AI-BI), FIodd=(BR-AR).
// Each fld.q fetches AReven,AIeven,ARodd,AIodd.
//Assume ER,EI,ERo,EIo are 4 contiguous regs.
//Assume FR,FI,FRo,FIo are 4 contiguous regs.
        adds    -16,astart,FEtch
        fld.q   16(FEtch)++,AR
        fld.q   16(FEtch)++,BR
        adds    0,groups,somecount      //bla counter
//startup the loop:
//                              A1......A2......A3......Write:
        pfadd.ss    AR,BR,f0    // AR+BRe
        pfadd.ss    AI,BI,f0    // AI+BIe ER
        d.pfadd.ss  ARo,BRo,f0  // ARo+BRo EI ER
        nop
        d.pfadd.ss  AIo,BIo,ER  // AIo+BIo ERo EI ER
        nop
        d.pfsub.ss  AR,BR,EI    // AR-BRe EIo ERo EI
        adds    -1,r0,decrem    //2 bflies per loop, but groups is half desired value.
        d.pfsub.ss  AI,BI,ERo   // AI-BIe FR EIo ERo
        adds    -16,astart,STore
        d.pfsub.ss  AIo,BIo,EIo // AIo-BIo FI FR EIo
        bla     decrem,somecount,offset2_loop   //init LCC
        d.pfsub.ss  BRo,ARo,FR  // BRo-ARo FRo FI FR
        nop
offset2_loop::
        d.fnop
        fld.q   16(FEtch)++,AR  //fetch AR,AI,ARo,AIo
        d.fnop
        fld.q   16(FEtch)++,BR  //fetch BR,BI,BRo,BIo
//                              A1......A2......A3......Write:
        d.pfadd.ss  AR,BR,FI    // AR+BRe FIo FRo FI
        nop
        d.pfadd.ss  AI,BI,FRo   // AI+BIe ER FIo FRo
        nop
        d.pfadd.ss  ARo,BRo,FIo // ARo+BRo EI ER FIo
        fst.q   ER,16(STore)++  //update ER,EI,ERo,EIo
        d.pfadd.ss  AIo,BIo,ER  // AIo+BIo ERo EI ER
        nop
        d.pfsub.ss  AR,BR,EI    // AR-BRe EIo ERo EI
        nop
        d.pfsub.ss  AI,BI,ERo   // AI-BIe FR EIo ERo
        fst.q   FR,16(STore)++
        d.pfsub.ss  AIo,BIo,EIo // AIo-BIo FI FR EIo
        bla     decrem,somecount,offset2_loop
        d.pfsub.ss  BRo,ARo,FR  // BRo-ARo FRo FI FR
        nop
endit::
// restore regs
        fiadd.ss f0,f0,f0       //exit DIM
        fld.q   0(sp),f12
        fiadd.ss f0,f0,f0       //last DIM pair
        fld.q   16(sp),f8
        adds    32,sp,sp
        bri     r1
        nop
//
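The pipe-diagram definitions in difstep.ss (K, L, M, N and the sums E and F) can be checked against ordinary complex arithmetic. The Python sketch below follows those definitions as given in the listing's comments; the function name `dif_butterfly` and the sample values are assumptions introduced for illustration.

```python
import cmath

def dif_butterfly(ar, ai, br, bi, wr, wi):
    """One DIF butterfly in real arithmetic: E = A + B, F = W * (A - B)."""
    er = ar + br
    ei = ai + bi
    dr = ar - br              # DR = AR - BR
    di = ai - bi              # DI = AI - BI
    k = wr * dr               # the four real multiplies
    l = wi * di
    m = wi * dr
    n = wr * di
    fr = k - l                # FR = K - L
    fi = n + m                # FI = N + M
    return (er, ei), (fr, fi)

a = complex(1.5, -2.0)
b = complex(0.25, 3.0)
w = cmath.exp(-2j * cmath.pi * 3 / 16)      # (cos, -sin) twiddle factor

(er, ei), (fr, fi) = dif_butterfly(a.real, a.imag, b.real, b.imag, w.real, w.imag)
e_ref = a + b
f_ref = w * (a - b)
print(abs(complex(er, ei) - e_ref), abs(complex(fr, fi) - f_ref))
```

Counting operations in this form gives 4 multiplies, 3 adds, and 3 subtracts per butterfly, which matches the tally in the listing's comments.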
C----------------------------------------------------------------
c     difstepf.f: do one stage of fft (DIF) butterflies
c     (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
c----------------------------------------------------------------
c     Decimation in Freq, radix-2, inplace, 1-dimen
c     6/20/89
c     Do one entire stage (n/2 butterflies). Sample invocation:
c        call difstep(a,w,groups,offset,wincr)
c     Inputs:
c       A= complex array of input, single-prec float
c          (complex stored as 4byte real, 4byte imag contiguously)
c       W= pointer to array of twiddle factors. Assuming W(k) is
c          CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1.
c       offset = distance (in "elements") between
c          the 2 input values for each butterfly
c       groups = number of sub-DFTs this stage is split into.
c          (groups*offset*2 = N)
c       wincr = distance between successive w values for successive butterflies
c
c     Outputs:
c       A= complex butterflied version of input.
      SUBROUTINE difstep (a,w,groups,offset,wincr)
      integer groups,offset,wincr
      integer i,j,index1,iplus
      complex a(groups*offset*2),w(groups*offset),wtemp,temp
c
c     We implement a...
c     Special case for offset=1 (last stage): no complex multiplies, simple add
c     (Performance enhancement)
      IF (offset .eq. 1) THEN
CVD$ NODEPCHK
         DO 8 i = 1,(2*groups),2
            iplus = i + 1
            temp = a(iplus)
            a(iplus) = a(i) - temp
    8    a(i) = a(i) + temp
      ELSE
C
C     Special case for offset=2 (next-to-last stage): no complex multiplies,
cc    simple add. (Performance enhancement)
cc    For half the butterflies, W=(1,0). For the other half, W=(0,-1)
      IF (offset .eq. 2) THEN
CVD$ NODEPCHK
         DO 90 i = 1,(4*groups),4
            iplus = i + 2
            temp = a(iplus)
            a(iplus) = a(i) - temp
   90    a(i) = a(i) + temp
C     2nd call to i-loop: w=cmplx(0,-1.)
CVD$ NODEPCHK
CVD$ NOVECTOR
         DO 92 i = 2,(4*groups),4
            iplus = i + 2
            temp = a(i) - a(iplus)
            a(i) = a(i) + a(iplus)
   92    a(iplus) = CMPLX(AIMAG(temp),-REAL(temp))
      ELSE
C
c     "DO 20" index1-loop is "outer loop"
CVD$ VECTOR
CVD$ NODEPCHK
         DO 20 index1 = 1,(2*offset*groups),(2*offset)
            j = 1
CVD$ NODEPCHK
CVD$ ALTCODE
            DO 10 i = index1,(index1+offset-1)
               iplus = i + offset
               temp = a(i) - a(iplus)
               a(i) = a(i) + a(iplus)
               a(iplus) = w(j) * temp
   10       j = j + wincr
   20    CONTINUE
      ENDIF
      ENDIF
      RETURN
      END

cccccccccccccccccccccccccccccccccc
      subroutine fetch(a,n)
      integer n
      complex a(n),temp
cc    Kludge do-nothing prefetch.
      temp = a(1)
      RETURN
      END

cccccccccccccccccccccccccccccccccc
      subroutine bitrev(a,dummy,n)
C     Bit-Reverse
C     Inputs:
C       A = complex array of input, single-prec float
C       dummy = %val(m). Probably unusable from Fortran.
C       N = number of input points (and output points)
C     Output:
C       A = original A data, but in bit-reversed order from A
      integer n,i,j,k,ndiv2
      complex a(n),temp
C
C     "DO 7" loop to in-place-bit-reverse-shuffle output
      j = 1
      ndiv2 = n / 2
      DO 7 i = 1,n-1
         IF (i .lt. j) THEN
            temp = a(j)
            a(j) = a(i)
            a(i) = temp
         ENDIF
         k = ndiv2
C     "While (j .gt. k)" /*decrease j by 2**something */
    6    IF (j .gt. k) THEN
            j = j-k
            k = k / 2
            GOTO 6
         ENDIF
C     Add next lower power of 2 to j
    7 j = j+k
      RETURN
      END
C--
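The "DO 7" loop above is a bit-reversed counter: j steps through the bit-reversed partner of i, and each pair is swapped once. A Python port, offered as a sketch, keeps the Fortran's 1-based i and j and subtracts 1 only when indexing the list; the name `bitrev_in_place` is an assumption.

```python
def bitrev_in_place(a):
    """Reorder list a (length a power of 2) into bit-reversed order."""
    n = len(a)
    j = 1
    ndiv2 = n // 2
    for i in range(1, n):                 # Fortran: DO 7 i = 1, n-1
        if i < j:
            a[j - 1], a[i - 1] = a[i - 1], a[j - 1]
        k = ndiv2
        while j > k:                      # decrease j by 2**something
            j -= k
            k //= 2
        j += k                            # add next lower power of 2 to j
    return a

print(bitrev_in_place(list(range(8))))
```

For eight elements the result is the order 0, 4, 2, 6, 1, 5, 3, 7, the familiar bit-reversed sequence for 3-bit indices.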
//
// bitrev.ss
// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
//
// BIT-reversal of 8byte array elements.
// IN PLACE.
// (Allows arrays of 8,16,32,64,128,256,512, or 1024 elements)
//
// INTEL is not responsible for use nor misuse of this code.
//
// 8/13/89
//================================================================
// Invocation: (from Fortran)
//    call bitrev(a,%VAL(m))
// Inputs:
//    a = r16 = pointer to array of 8byte elements
//    m = r17 (call by value) = base-2 log of total number of elements
//        (2**m = N)
// Outputs:
//    a = Bit-reversed ordered version of A
//
// Expected best-can-do performance, and measured performances
//    approx 4*N clocks (0.06 mSec for 512 points)
//
define(astart,r16)      //initial input data base address
define(m,r17)
define(logN,r17)
define(dest1,r19)
define(dest2,r20)
define(dest3,r21)
define(dest4,r22)
define(iptr,r23)        //index-array pointer
define(decrem,r24)      //bla decrement
define(count,r25)       // bla counter

        .text
        .align  .quad
_bitrev_::
_bitr_::
//fetch base address for index table (rbasetab)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.
        shl     3,logN,r30      //scale to 8-byte-entry length
        mov     rbasetab,r29
        ld.l    r29(r30),iptr
        addu    4,r29,r29
        ld.l    r29(r30),count  //number of swaps required for this value N
        pfld.d  0(iptr),f0      //initiate fetch of first 2 bit-rev indices
        pfld.d  8(iptr)++,f0
        adds    -2,r0,decrem    //2 swaps per loop
        pfld.d  8(iptr)++,f0
        bla     decrem,count,revloop    //init LCC
        pfld.d  8(iptr)++,f16   //get 2 indices, but don't cache the indices
revloop::       //2 swaps per loop
// 7.5 cycles consumed for each swap, best case.
        pfld.d  8(iptr)++,f18   //2 more indices
        fxfr    f16,dest1       //transfer to integer index regs
        fxfr    f17,dest2
        fld.d   dest1(astart),f24       //fetch 2 elements to swap
        fld.d   dest2(astart),f26
        fxfr    f18,dest3
        fst.d   f24,dest2(astart)
        fst.d   f26,dest1(astart)
        fxfr    f19,dest4
        fld.d   dest3(astart),f28
        fld.d   dest4(astart),f30
        pfld.d  8(iptr)++,f16   //2 more indices
        fst.d   f28,dest4(astart)
        bla     decrem,count,revloop    //
        fst.d   f30,dest3(astart)
        bri     r1
        nop
//
// _fetch8_: Touch all 32-byte lines in the 8k data bytes, to get them
//    into dcache. (ASSUMING .lte. 8Kbytes and .gte. 4Kbytes)
//
// Invocation= fetch(astart,num8)
// Inputs=
//    astart=r16=pointer to data which is to be touched.
//    num8=r17 (passed by VALUE, %VAL(), not by reference)
//
// Using RC and RB to improve dcache hit rates, for FFTs bigger than
// 1024 complex (8kB).
// RC=10 causes replacement only of block denoted by RB lsbit. RC=11 disables
// replacement.
//
define(num8,r17)
define(FEtch,r26)

_fetch8_::
_fetch_::
        ld.c    dirbase,r30
        or      0x800,r30,r30   // Replace Dcache slot 0 only (RC=10,RB=00)
        st.c    r30,dirbase
// Put 4Kbytes into Dcache slot 0. (The rest after 4kB goes to slot 1).
        adds    -4,r0,decrem    //4 8-byte-groups per cache line
        adds    508,r0,count    //512, but pre-decremented for bla usage
        bla     decrem,count,floop
        adds    -32,astart,FEtch
floop::
        bla     decrem,count,floop
        fld.d   32(FEtch)++,f30 //dummy load.
        adds    -512,num8,count
        bc      fdone           //if data exhausted, quit
//      ld.c    dirbase,r30
        or      0x900,r30,r30   // Replace Dcache slot 1 only (RC=10,RB=01)
        st.c    r30,dirbase
        adds    -8,count,count  //predecr for bla
        bla     decrem,count,floop2     //set LCC
        fld.d   32(FEtch)++,f30
floop2::
        bla     decrem,count,floop2
        fld.d   32(FEtch)++,f30 //dummy load.
fdone::
// unlock dcache
        andnot  0xF00,r30,r30   //clear RC,RB (dirbase(11:8))
        st.c    r30,dirbase
        bri     r1
        nop

        .data
//
// rbasetab:: (Table of bit-reversed indices for bitrev subroutine)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.
        .align  .quad
rbasetab::
        .long   [6]0            //don't bother with log(n)=0,1,2
        .long   rev8, 0
        .long   rev16, 4
        .long   rev32, 10
        .long   rev64, 26
        .long   rev128, 54
        .long   rev256, 118
        .long   rev512, 238
        .long   rev1024, 494
//number of swaps=240 for N=512 (ie, 32 symmetrical patterns
// exist between 0 and 511.)

// rev512: array of bit-reversed indices, for N=512.
// Each entry is ("i", and "bit-reversed-i"), shifted left by 3
// to account for 8-byte-elements.
// NOTE: This listing DOES NOT SHOW all the table elements, to save paper.
        .align  .quad
rev512::
        .long   8, 2048, 16, 1024
        .long   24, 3072, 32, 512
        .long   40, 2560, 48, 1536
// ETC..., ETC...., ETC...
//================================================================
        .align  .quad
rev1024::
        .long   8, 4096, 16, 2048
        .long   24, 6144, 32, 1024
        .long   40, 5120, 48, 3072
        .long   56, 7168, 64, 512
// ETC..., ETC...., ETC...
//Number of swaps = 496
//N (Number of elements) = 1024
        .align  .quad
rev16::
        .long   1*8, 8*8, 2*8, 4*8
        .long   3*8,12*8, 5*8,10*8
        .long   7*8,14*8,11*8,13*8
rev8::
        .long   1*8, 4*8, 3*8, 6*8
//================================================================
        .align  .quad
rev32::
        .long   8, 128, 16, 64, 24, 192, 40, 160, 48, 96, 56, 224
        .long   72, 144, 88, 208, 104, 176, 120, 240, 152, 200, 184, 232
//================================================================
        .align  .quad
rev64::
        .long   8, 256, 16, 128
        .long   24, 384, 32, 64
        .long   40, 320, 48, 192
        .long   56, 448, 72, 288
// ETC..., ETC...., ETC...
//================================================================
        .align  .quad
rev128::
        .long   8, 512, 16, 256
        .long   24, 768, 32, 128
        .long   40, 640, 48, 384
        .long   56, 896, 72, 576
// ETC..., ETC...., ETC...
//Number of swaps = 56 (Number of elements) = 128
//================================================================
        .align  .quad
rev256::
        .long   8, 1024, 16, 512
        .long   24, 1536, 32, 256
        .long   40, 1280, 48, 768
        .long   56, 1792, 64, 128
// ETC..., ETC...., ETC...
//Number of swaps = 120, N (Ni
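Because the listing shows only the first few entries of most swap tables, a short generator can help when checking or rebuilding them. The Python sketch below reproduces the visible entries under one assumption about layout: each table lists the byte-offset pairs (8*i, 8*rev(i)) for every i smaller than its bit-reversed partner, in increasing i. The function names are hypothetical.

```python
def bit_reverse(i, m):
    """Reverse the low m bits of i."""
    r = 0
    for _ in range(m):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def swap_table(m, elem_bytes=8):
    """Byte-offset pairs to swap for an in-place bit reversal of 2**m elements."""
    pairs = []
    for i in range(1 << m):
        r = bit_reverse(i, m)
        if i < r:                          # list each swap once; skip palindromes
            pairs.append((i * elem_bytes, r * elem_bytes))
    return pairs

for m in range(3, 11):
    table = swap_table(m)
    # rbasetab stores the swap count pre-decremented by 2 for the bla loop.
    print(f"N={1 << m:5d}  swaps={len(table):4d}  rbasetab count={len(table) - 2}")
```

The printed counts can be compared against the second column of rbasetab, and `swap_table(4)` against the rev16 entries shown in full.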






      PROGRAM FFTTEST
C
C     1-D FFT TEST PROGRAM
C
C     Intel assumes no responsibility for use or misuse of this code
C
C     7/20/89
C
      character*8 REALLY
      PARAMETER (IREV=0)
      PARAMETER (REALLY=' complex')
      PARAMETER (TIMEIT=1, CACHETIME=0)
      DATA IT/200000/
c     PARAMETER (N=1024,M=10)
      PARAMETER (N=512,M= 9)
c     PARAMETER (N=256,M= 8)
c     PARAMETER (N=128,M= 7)
c     PARAMETER (N=64,M= 6)
c     PARAMETER (N=32,M= 5)
c     PARAMETER (N=16,M= 4)
      PARAMETER (PI=3.1415926536)
      COMPLEX X(N),X1(N),X2(N),X3(N),W(N/2)
c     Fortran complex values stored R,I, R,I for arrays.
      Real ASQR(N),ASQR2(N),XR(N)
      complex wtemp
      real rtemp
C
      PRINT *,' FFT test program (ffttest.f)'
      print *,'================================'
      IF (IREV .eq. 0) THEN
         print *,'NOT counting time for bit-reversal.'
         print *,'DO NOT expect matching answers, without bit-rev'
      ELSE
         print *,'Time for bit-reversal included.'
      ENDIF
      print *,'Time for cache writeback and fills...'
      IF (CACHETIME .eq. 0) THEN
         print *,' NOT included, if iterating.'
      ELSE
         print *,' ... included.'
      ENDIF
      print *,'================================'
      print *,'If iterating... Number of Iterations =',IT
      print *,'Number of Points = ',N
      print *,'(',REALLY,' data)'
      print *,'================================'
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C Init twiddle factor array w(k) with (cos, -sin) of 2pi*k/N 
C (Should just declare this as constant, if N is non-variable) 

C (OR could have one constant 512-entry W (for N=1024) , adjust wincr accordingly 
C in diff.f for smaller N) 
rtemp = 2.0*pi/N 

wtemp= CMPLX (cos (rtemp) , -sin(rtemp)) 
w(l) = (1.0, 0.0) 

DO 200 k = 2,N/2 

200 w(k) = wtemp * w(k-l) 

cc print “f*,* W (twiddle) initialization completed * 

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE input data
C
      PIN = (4*PI)/ N
      DO 100 I = 1, N
c For testing with sinewave input data:
c       Treal = COS( I*PIN)
c       Timag = SIN( I*PIN)
c For testing with squarewave input:
cc      IF (I .lt. N/2) THEN
cc        Treal = 1.0
cc        Timag = 0.5
cc      ELSE
cc        Treal = 0.0
cc        Timag = 0.0
cc      ENDIF
C For testing with ramp function input data:
        Treal = I-1.0
        Timag = Treal + 0.5
        X(I) = CMPLX (Treal, Timag)
        X1(I) = CMPLX (Treal, Timag)
        X2(I) = CMPLX (Treal, Timag)
        X3(I) = CMPLX (Treal, Timag)
 100  CONTINUE
C

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      IF (TIMEIT .ne. 0) THEN
        CALL fft (X2, M, N)
cc Subroutine fft is Decimation-In-Time, Fortran version.
c       CALL ditt(X, M, N,W,IREV)
        CALL diff(X, M, N,W,IREV)
      ENDIF
ccccccccccccccccccccccccccccccccccccccc
      IF (IREV .ne. 0) THEN
       IF (TIMEIT .eq. 0) THEN
        call vcompare (X,X2,2*N)
        call cmags (X,N,ASQR)
c cmags to take squared magnitude of complex values
        call cmags (X2,N,ASQR2)
cc
C print non-zero results:
        J=0
        DO 700 I = 1,N
          IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
            WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
 22   FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2,' ASQR2(I)= ',F14.2//)
            J = J+1
            IF (J .GT. 32) GOTO 725
          ENDIF
 700    CONTINUE
 725    CALL TIME
       ENDIF
      ENDIF

      IF (TIMEIT .ne. 0) THEN
ccccccccccccccccccccccccccccccccccccccc
cc- Timing loop follows:
        print *,' Start Ass.FFT'
        IF (CACHETIME .eq. 0) THEN
          DO 500 I=1, IT, 4
C Reuse same array, so cache fill and writeback time NOT included.
            CALL diff(X, M, N,W,IREV)
            CALL diff(X, M, N,W,IREV)
            CALL diff(X, M, N,W,IREV)
 500      CALL diff(X, M, N,W,IREV)
        ELSE
          DO 504 I=1, IT, 4
C Alternating between X,X1,X2,X3 should provide cache misses.
            CALL diff(X, M, N,W,IREV)
            CALL diff(X1, M, N,W,IREV)
            CALL diff(X2, M, N,W,IREV)
 504      CALL diff(X3, M, N,W,IREV)
        ENDIF
        print *,' END Ass.FFT'
ccccccccccccccccccccccccccccccccccccccc
      ENDIF
      STOP
      END
cc
      subroutine vcompare (res,exp,n)
c VCOMPARE compares 2 REAL vectors, prints out 1st few miscompares
c
      integer n, errcnt
      real res(n), exp(n)
      write (6,12)
 12   format ('*** VCOMPARE: vector comparison beginning ***')
      data errcnt/0/
      do 30 i = 1,n
        if (AINT(res(i)) .ne. AINT(exp(i))) then
c         (print out error, exit if alot already)
 120      print *,'*** Error in compares ***'
          write(6,121) i
 121      format (' Item number = ',I6)
          write (6,124) res(i), exp(i)
 124      format (' Res.=',F14.2,' Expected.=',F14.2)
          errcnt = errcnt + 1
          if (errcnt .gt. 19) then
            return
          end if
        end if
 30   continue
      if (errcnt .eq. 0) then
 190    print *,'*** vector compares SUCCESSFUL ***'
      end if
 99   return
      end
c
C File: ditt.f
C 6/15/89
C Intel assumes no responsibility for use or misuse of this code.
C FFT - Decimation in TIME, radix-2, inplace, 1-dimen
C Inputs:
C   A= complex array of input, up to 1024 pts, single-prec float
C   M= log of number of pts
C    = (Number of stages of FFT)
C   N = number of points, ie, N= 2**M = number of pts
C   W= complex array of twiddle factors, length=N/2.
C   REV= ignored parameter.
C
C Outputs:
C   A= complex fft of input A. Correct order (bit-reversal done).
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      subroutine ditt (a,m,N,W,REV)
      integer m,N, i, REV,wlimit
      integer offset, stage, groups, wincr,powers2(0:10)
      complex a(n),w(N/2),temp
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C Powers2 to avoid calls to POW, DIV
C Twiddle factor array w(i) has (cos, -sin) of 2pi*i/N
CC Assume the caller provides w(i) constants ALREADY initialized
C
C Pre-touch data, lock into cache, for 8kByte fft:
      IF (N .gt. 513) THEN
        call fetch(a,%VAL(n))
      ENDIF
C
      call bitrev(a,%VAL(M),n)
C Bitreversal of input needed for in-place decim in time FFT, to avoid
C fetching twiddle-factors in bitrev order.
      wlimit = 8*((N/2) - 1)
      DO 20 stage = 1,m
        groups = powers2(m-stage)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.
C offset gets 1,2,4,8,...N/2
        offset = powers2(stage-1)
        wincr = groups
        call ditstep(a,w,groups,offset,wincr,wlimit)
 20   CONTINUE
      RETURN
      END
C
//
// ditstep.ss: do one stage of fft butterflies
// DIT = Decimation in Time, radix-2, inplace, 1-dimension
// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
// 7/15/89
//
// Intel is not responsible for use nor for misuse of this program.
//
// Do one entire stage (n/2 butterflies). Sample invocation:
//   call ditstep(a, w, groups, offset, wincr, wlimit)
// Inputs:
//   A= complex array of input, single-prec float
//      (complex stored as 4byte real, 4byte imag contiguously)
//   W= pointer to array of twiddle factors. Assuming W(k) is
//      CMPLX(cos(2pi*k/N), -sin(2pi*k/N)) for k=0 to (N/2)-1.
//   offset = distance (except for scale-by-8byte sizeof(complex)) between
//      the 2 input values for each butterfly.
//      Offset also is the number of butterflies done per "group".
//   groups = N/(2*offset). The number of sub-DFTs this stage is split into.
//   wincr = distance (except for scale-by-8byte sizeof(complex)) between
//      successive w values for successive butterflies
//   wlimit = max index, in bytes, of W table.
//
// Outputs:
//   A= complex radix-2 butterflied version of input.
//
//
define(astart,r16)    // input data base address
define(wstart,r17)    //twiddle array ptr. Because w-contents depend on N,
                      // we will assume the caller has initialized w() array.
define(groups,r18)    //groups=number of sub-DFTs this stage is split into.
define(offset,r19)    //offset (initially elements, mult by 8 to get bytes)
      // between node and its dual (the 2 numbers to butterfly, ie. A and B)
define(wincr,r20)     //increment between successive W values. Remains constant
                      // within a given stage.
define(wlimit,r21)    //max index, in bytes, of W table.
define(wind,r22)      //current index, in bytes, of W table.
define(offset2,r23)   //offset*2
define(decrem,r24)    //bla decrement
define(somecount,r25) // bla counter
define(FEtch,r26)     //pointer to 1st component of butterfly (load)
define(STore,r27)     //   "     "  1st component of butterfly (store)
define(offsetp8,r28)  //offset+8
// f4:f7 spare
define(ARe,f12)   //element A, real component
define(AIe,f13)   //   "    "  , imag
define(ARo,f14)   // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BRe,f16)   //element B, real component
define(BIe,f17)
define(BRo,f18)   // extra B value, for prefetch
define(BIo,f19)
define(ERe,f20)   //A+(B*W), real (ER = AR + BR)
define(EIe,f21)   //   "     imag  "
define(ERo,f22)   // previous loop's value
define(EIo,f23)   //   "     imag  "
define(FRe,f24)   //A-(B*W), real
define(FIe,f25)   //   "     imag  "
define(FRo,f26)   // previous loop's value
define(FIo,f27)   //   "     imag  "
define(PR,f28)    //(B*W), real
define(PI,f29)    //(B*W), imag
define(WRe,f30)   //W (twiddle factor), real part
define(WIe,f31)   //   "    "    , imag
define(WRo,f10)   //W (twiddle factor), real part (EXTRA copy)
define(WIo,f11)   //   "    "    , imag

    .text
    .align .quad
_ditstep_::
    ld.l    0(groups),groups     //fix Fortran call-by-ref
    ld.l    0(offset),offset     //
    shl     3,offset,offset      // change from elements to bytes
    shl     1,offset,offset2
    adds    8,offset,offsetp8
    fst.q   f8,-16(sp)++         //save "local" regs
    fst.q   f12,-16(sp)++        //  "     "
    adds    -1,groups,groups     // pre-decrement for bnc usage, or bla usage
    adds    -16,r0,decrem        //bla decrement
// We code the last 2 stages as special cases:
//
    xor     8,offset,r0   //offset=1, special case, no complex mult, funny addressing
    bc      offset_1      // (ASSUMING offset=1 means wincr=0, and no twiddle used)
    xor     16,offset,r0  //offset=2, special case, no complex mult
    bc      offset_2
//
    ld.l    0(wincr),wincr
    ld.l    0(wlimit),wlimit
    pfadd.ss f0,f0,f0
    pfadd.ss f0,f0,f0
    pfadd.ss f0,f0,f0            // init A1,A2,A3=0
    pfmul.ss f0,f0,f0
    pfmul.ss f0,f0,f0
    pfmul.ss f0,f0,f0
//
// init pointers:
    shl     3,wincr,wincr        //scale for bytes.
    shl     1,wincr,wind         //init wind =2*wincr
    pfld.d  0(wstart),f0
    pfld.d  wincr(wstart),f0
    adds    -8,astart,FEtch
    pfld.d  wind(wstart),f0
    adds    wincr,wind,wind      //wind now 3*wincr
// here fetch first set of B,W before bla-loop
    pfld.d  wind(wstart),WRe
    adds    wincr,wind,wind
//first B fetch from offset, then 1st A fetch from 0.
    fld.d   offsetp8(FEtch),BRe  //first B value
    and     wlimit,wind,wind     //modulo-wlimit the w index
// We do modulo-addressing on W(), to keep the pfld pipeline full. We
// never do a W-fetch beyond the end of the table.
// And the modulo-check needs to be done only every 4th pfld, as always
// we use a multiple of 4 W() factors.
    d.r2ap1.ss f0,f0,f0          //clear Treg.
    adds    -32,offset,somecount // bla counter (predecrement by 4 elements)
//
// Definitions for pipe diagram:
// Anew = E = A+(B*W)
// Bnew = F = A-(B*W)
// Let P=(B*W).
//
// (the complex multiply product, P, broken into 4 real mult and 2 adds)
// WR = cos(), WI=-sin().
// PR = K - L; where K= WR*BR, L=WI*BI
// PI = N + M; where N= WI*BR, M=WR*BI
// ER = AR + PR (Overwrites AR)
// EI = AI + PI (    "      AI)
// FR = AR - PR (    "      BR)
// FI = AI - PI (    "      BI)
// For 1st time thru inner-loop, don't have correct values to store.
// Must do 1 loop before the loop, sans the stores.
//
first_bfly::    //fill pipe
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.r2pt.ss   WRe,f0,f0   // WRe - - - - - - - - -
      pfld.d  wind(wstart),WRo
    d.i2st.ss   WIe,f0,f0   // WIe
      adds    wincr,wind,wind
    d.r2ap1.ss  f0,BRe,f0   // K0
      fld.d   8(FEtch)++,ARe        //first A value
    d.pfmul.ss  WIe,BIe,f0  // L0 K0 - - - - - -
      pfld.d  wind(wstart),WRe
    d.r2pt.ss   WRo,BIe,f0  // WRo M0 L0 K0
      fld.d   offsetp8(FEtch),BRo
    d.rat1s2.ss f0,PR,f0    // M0 L0 K0
      adds    wincr,wind,wind
    d.i2st.ss   WIo,BRe,f0  // WIo N0 M0 K0 K-L0
      nop
    d.r2ap1.ss  f0,BRo,f0   // K1 N0 M0 PR0
      and     wlimit,wind,wind
    d.pfsub.ss  f0,PI,f0    // K1 N0 M0 PR0
      fld.d   8(FEtch)++,ARo
    d.pfadd.ss  ARe,PR,PR   // K1 N0 M0 ER0 PR0
      fld.d   offsetp8(FEtch),BRe
    d.pfmul.ss  WIo,BIo,f0  // L1 K1 N0 M0 ER0
      nop
    d.r2pt.ss   WRe,BIo,f0  // WRe M1 L1 K1 M0 M+N0 ER0
      bla     decrem,somecount,restart   //init LCC
    d.rat1s2.ss ARe,PR,f0   // - M1 L1 K1 FR0 PI0 ER0 -
      nop
restart::
    d.i2st.ss   WIe,BRo,ERe // WIe N1 M1 K1 K-L1 FR0 PI0 ER0
      adds    -16,astart,STore      // ptrs init 16 low, for fst.q instructions
//
// Each butterfly = 1 complx multiply, 1 complx add, 1 complx subtract
//                = 4 multiply, 3 add, 3 subtract
//                  3 8-byte fetches (A, B, W)
//                  2 8-byte stores (A, B)
//
//                  7 cycles per butterfly
//
// inner_loop: iterates "offset/2" times
// for each group. It does 2 butterflies per iteration.
// AR/AI fetches need to be a cycle behind BR/BI fetches here. So we
// must index with offset+8 into B.
// AR is used 1/2 loop before AI.
// Pattern: AI0,AR1,BR2,BI2; AI1,AR2,BR3,BI3.

inner_loop::                // KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.r2ap1.ss  AIe,BRe,PI  // K2 N1 - M1 EI0 PR1 FR0 PI0
      pfld.d  wind(wstart),WRo
    d.pfsub.ss  AIe,PI,FRe  // K2 N1 M1 FI0 EI0 PR1 FR0
      fld.d   8(FEtch)++,ARe
    d.pfadd.ss  ARo,PR,PR   // K2 N1 M1 ER1 FI0 EI0 PR1
      fld.d   offsetp8(FEtch),BRo
    d.pfmul.ss  WIe,BIe,f0  // L2 K2 N1 M1 ER1 FI0 EI0 -
      adds    wincr,wind,wind
    d.r2pt.ss   WRo,BIe,EIe // WRo M2 L2 K2 M+N1 ER1 FI0 EI0
      pfld.d  wind(wstart),WRe
    d.rat1s2.ss ARo,PR,FIe  // - M2 L2 K2 FR1 PI1 ER1 FI0
      adds    wincr,wind,wind
    d.i2st.ss   WIo,BRe,ERo // WIo N2 M2 K2 K-L2 FR1 PI1 ER1
      and     wlimit,wind,wind      //modulo.
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.r2ap1.ss  AIo,BRo,PI  // K3 N2 - M2 EI1 PR2 FR1 PI1
      nop
    d.pfsub.ss  AIo,PI,FRo  // K3 N2 M2 FI1 EI1 PR2 FR1
      fld.d   8(FEtch)++,ARo
    d.pfadd.ss  ARe,PR,PR   // K3 N2 M2 ER2 FI1 EI1 PR2
      fld.d   offsetp8(FEtch),BRe
    d.pfmul.ss  WIo,BIo,f0  // L3 K3 N2 M2 ER2 FI1 EI1 -
      nop
    d.r2pt.ss   WRe,BIo,EIo // WRe M3 L3 K3 M+N2 ER2 FI1 EI1
      fst.q   ERe,16(STore)++       //update ERe/EIe/ERo/EIo
    d.rat1s2.ss ARe,PR,FIo  // M3 L3 K3 FR2 PI2 ER2 FI1
      bla     decrem,somecount,inner_loop
    d.i2st.ss   WIe,BRo,ERe // WIe N3 - M3 K3 K-L3 FR2 PI2 ER2
      fst.q   FRe,offset(STore)     //update FRe/FIe/FRo/FIo
end_inner_loop::            //KEEP Pipelines full
// RE-init pointers for fetches
    d.fiadd.ss  f0,f0,f0
      adds    offset2,astart,astart //bump to next group
//redo A,B fetches, with proper ptr.
    d.fiadd.ss  f0,f0,f0
      fld.d   offset(astart),BRe    //get first BR/BI in next group
    d.fiadd.ss  f0,f0,f0
      adds    -8,astart,FEtch
last_bfly::                 //do final 2 butterflies, start next group
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.r2ap1.ss  AIe,BRe,PI  // K0 N3 - M3 EI2 PR3 FR2 PI2
      pfld.d  wind(wstart),WRo
    d.pfsub.ss  AIe,PI,FRe  // K0 N3 - M3 FI2 EI2 PR3 FR2
      fld.d   8(FEtch)++,ARe
    d.pfadd.ss  ARo,PR,PR   // K0 N3 M3 ER3 FI2 EI2 PR3
      fld.d   offsetp8(FEtch),BRo
    d.pfmul.ss  WIe,BIe,f0  // L0 K0 N3 M3 ER3 FI2 EI2
      adds    wincr,wind,wind
    d.r2pt.ss   WRo,BIe,EIe // WRo M0 L0 K0 M+N3 ER3 FI2 EI2
      pfld.d  wind(wstart),WRe
    d.rat1s2.ss ARo,PR,FIe  // M0 L0 K0 FR3 PI3 ER3 FI2
      adds    wincr,wind,wind
    d.i2st.ss   WIo,BRe,ERo // WIo N0 M0 K0 K-L0 FR3 PI3 ER3
      and     wlimit,wind,wind      //modulo
    d.r2ap1.ss  AIo,BRo,PI  // K1 N0 M0 EI3 PR0 FR3 PI3
      adds    -32,offset,somecount  // reset bla counter
    d.pfsub.ss  AIo,PI,FRo  // K1 N0 M0 FI3 EI3 PR0 FR3
      fld.d   8(FEtch)++,ARo
    d.pfadd.ss  ARe,PR,PR   // K1 N0 - M0 ER0 FI3 EI3 PR0
      fld.d   offsetp8(FEtch),BRe
    d.pfmul.ss  WIo,BIo,f0  // L1 K1 N0 M0 ER0 FI3 EI3
      bla     decrem,somecount,nowhere   //re-init LCC=1
    d.r2pt.ss   WRe,BIo,EIo // WRe M1 L1 K1 M+N0 ER0 FI3 EI3
      adds    -1,groups,groups
nowhere::
    d.rat1s2.ss ARe,PR,FIo  // - M1 L1 K1 FR0 PI0 ER0 FI3
      fst.q   ERe,16(STore)++
    d.fnop
      bnc.t   restart               //branch on value of groups
    d.fnop
      fst.q   FRe,offset(STore)
end_last_bfly::
    d.fnop
      br      endit
    fiadd.ss    f0,f0,f0
      fst.q   FRe,offset(STore)     //repeated for bnc.t untaken case
    .align .quad
offset_1::
// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0,
// and that w=(1,0), so that no complex mult needed.
// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd, 4 dword fld, 4 fst,
// 1 bla) (fld.q used to reduce # flds)
// Performance = 4 cyc/bfly best case.
//Redefine regs for fld.q, fst.q usage, when A and B adjacent:
define(AR3,f12)   //element A, real component
define(AI3,f13)   //   "    "  , imag
define(BR3,f14)   //element B, real component
define(BI3,f15)
define(AR4,f16)   // extra A value, for prefetch
define(AI4,f17)
define(BR4,f18)
define(BI4,f19)
define(ER3,f20)   //A+B, real (ER = AR + BR)
define(EI3,f21)   //  "   imag  "
define(FR3,f22)   //(A-B), real
define(FI3,f23)   //  "    imag
define(ER4,f24)   //A+B, real
define(EI4,f25)   //  "   imag
define(FR4,f26)   //(A-B), real
define(FI4,f27)   //  "    imag
//==========================================
    adds    -16,astart,FEtch
    fld.q   16(FEtch)++,AR4
    adds    -1,groups,somecount  // bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally.
    adds    -16,FEtch,STore
//startup the loop:
//                              A1......A2......A3......Write
    d.pfadd.ss  AR4,BR4,f0   // ARn+BRn -
      fld.q   16(FEtch)++,AR3
    d.pfadd.ss  AI4,BI4,f0   // AIn+BIn ERn - -
      adds    -2,r0,decrem          //2 bflies per loop
    d.pfsub.ss  AR4,BR4,f0   // ARn-BRn EIn ERn -
      bla     decrem,somecount,offset1_loop   //init LCC
    d.pfsub.ss  AI4,BI4,ER4  // AIn-BIn FRn EIn ERnext
      nop
//                              A1......A2......A3......Write
offset1_loop::
    d.pfadd.ss  AR3,BR3,EI4  // AR+BR FI- FR- EI-
      nop
    d.pfadd.ss  AI3,BI3,FR4  // AI+BI ER FI- FR-
      fld.q   16(FEtch)++,AR4
    d.pfsub.ss  AR3,BR3,FI4  // AR-BR EI ER FI-
      fst.q   ER4,16(STore)++
    d.pfsub.ss  AI3,BI3,ER3  // AI-BI FR EI ER
      nop
    d.pfadd.ss  AR4,BR4,EI3  // AR2+BR2 FI FR EI
      fld.q   16(FEtch)++,AR3
    d.pfadd.ss  AI4,BI4,FR3  // AI2+BI2 ER2 FI FR
      nop
    d.pfsub.ss  AR4,BR4,FI3  // AR2-BR2 EI2 ER2 FI
      bla     decrem,somecount,offset1_loop
    d.pfsub.ss  AI4,BI4,ER4  // AI2-BI2 FR2 EI2 ERnext
      fst.q   ER3,16(STore)++
//
end_offset1_loop::
    d.fiadd.ss  f0,f0,f0
      br      endit
    fiadd.ss    f0,f0,f0
      nop

//
    .align .quad
offset_2::
// want FEtch=0,1; 4,5; 8,9; 12,13; ... elements.
// ASSUMING wincr=N/4 (W_addr=0,N/4,0,N/4,0,...). Trivial W() factors.
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So EReven=(AR+BR), EIeven=(AI+BI).
// So FReven=(AR-BR), FIeven=(AI-BI).
// Odd components have W=(0,-1). So B*W = (BI,-BR).
// So ERodd=Re(A+(B*W)) = (AR+BI) EIodd=(AI-BR).
// So FRodd=Re(A-(B*W)) = (AR-BI) FIodd=(AI+BR).
// Each fld.q fetches AReven,AIeven,ARodd,AIodd.
//Assume ERe,EIe,ERo,EIo are 4 contiguous regs.
//Assume FRe,FIe,FRo,FIo are 4 contiguous regs.
//Assume ARe,AIe,ARo,AIo are 4 contiguous regs.
    adds    -16,astart,FEtch
    fld.q   16(FEtch)++,ARe
    fld.q   16(FEtch)++,BRe
    adds    0,groups,somecount   //bla counter
//startup the loop:
//                              A1......A2......A3......Write
    pfadd.ss    ARe,BRe,f0   // AR+BRe
    pfadd.ss    AIe,BIe,f0   // AI+BIe ER
    d.pfadd.ss  ARo,BIo,f0   // ARo+BIo EI ER -
      nop
    d.pfsub.ss  AIo,BRo,ERe  // AIo-BRo ERo EI ER
      nop
    d.pfsub.ss  ARe,BRe,EIe  // AR-BRe EIo ERo EI
      adds    -1,r0,decrem   //2 bflies per loop, but groups is half desired value.
    d.pfsub.ss  AIe,BIe,ERo  // AI-BIe FR EIo ERo
      adds    -16,astart,STore
    d.pfsub.ss  ARo,BIo,EIo  // ARo-BIo FI FR EIo
      bla     decrem,somecount,offset2_loop   //init LCC
    d.pfadd.ss  AIo,BRo,FRe  // AIo+BRo FRo FI FR
      nop
offset2_loop::
    d.fnop
      fld.q   16(FEtch)++,ARe      //fetch AR,AI,ARo,AIo
    d.fnop
      fld.q   16(FEtch)++,BRe
//                              A1......A2......A3......Write
    d.pfadd.ss  ARe,BRe,FIe  // AR+BRe FIo FRo FI
      nop
    d.pfadd.ss  AIe,BIe,FRo  // AI+BIe ER FIo FRo
      nop
    d.pfadd.ss  ARo,BIo,FIo  // ARo+BIo EI ER FIo
      fst.q   ERe,16(STore)++      //update ER,EI,ERo,EIo
    d.pfsub.ss  AIo,BRo,ERe  // AIo-BRo ERo EI ER
      nop
    d.pfsub.ss  ARe,BRe,EIe  // AR-BRe EIo ERo EI
      nop
    d.pfsub.ss  AIe,BIe,ERo  // AI-BIe FR EIo ERo
      fst.q   FRe,16(STore)++
    d.pfsub.ss  ARo,BIo,EIo  // ARo-BIo FI FR EIo
      bla     decrem,somecount,offset2_loop
    d.pfadd.ss  AIo,BRo,FRe  // AIo+BRo FRo FI FR
      nop
endit::
// restore regs
    fiadd.ss    f0,f0,f0     //exit DIM
    fld.q   0(sp),f12
    fiadd.ss    f0,f0,f0     //last DIM pair
    fld.q   16(sp),f8
    adds    32,sp,sp
    bri     r1
    nop
//==========================================
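
The ditstep comments describe three variants of the same butterfly: the general case with a full complex multiply, the offset=1 case where W=(1,0), and the odd elements of the offset=2 case where W=(0,-1). They also describe wrapping the W-table byte index with an AND against wlimit. The C sketch below restates those identities so they can be checked numerically. All names in it (`cpx`, `bfly`, `bfly_w0`, `bfly_wq`, `wrap_wind`) are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <math.h>

typedef struct { float re, im; } cpx;

/* General butterfly: E = A + B*W, F = A - B*W.
 * B*W costs 4 real multiplies: K, L, M, N in the listing's notation. */
void bfly(cpx *a, cpx *b, cpx w)
{
    float pr = w.re * b->re - w.im * b->im;   /* PR = K - L */
    float pi = w.im * b->re + w.re * b->im;   /* PI = N + M */
    cpx e = { a->re + pr, a->im + pi };
    cpx f = { a->re - pr, a->im - pi };
    *a = e;
    *b = f;
}

/* W = (1,0): B*W = B, so E = A + B and F = A - B. */
void bfly_w0(cpx *a, cpx *b)
{
    cpx e = { a->re + b->re, a->im + b->im };
    cpx f = { a->re - b->re, a->im - b->im };
    *a = e;
    *b = f;
}

/* W = (0,-1): B*W = (BI, -BR), so again no multiply is needed. */
void bfly_wq(cpx *a, cpx *b)
{
    cpx e = { a->re + b->im, a->im - b->re };   /* (AR+BI, AI-BR) */
    cpx f = { a->re - b->im, a->im + b->re };   /* (AR-BI, AI+BR) */
    *a = e;
    *b = f;
}

/* Wrap a byte index into the W table, as "and wlimit,wind,wind" does.
 * With wlimit = 8*((N/2)-1), N a power of two and wind a multiple of 8,
 * the AND acts as wind modulo the table size in bytes. */
unsigned wrap_wind(unsigned wind, unsigned wlimit)
{
    return wind & wlimit;
}
```

For N=512 the table is 2048 bytes and wlimit is 2040, so a byte index of 2056 wraps back to 8.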
c
C File: dirr.f
C FFT - Decimation in Freq, radix-2, inplace, 1-dimen,
C REAL input
C Intel is not responsible for use nor misuse of this code.
C 8/14/89
C Inputs:
C   A= REAL array of input, up to 1024 pts, single-prec float
C   M= log of number of pts
C    = (Number of stages of FFT)
C   N = number of points, ie, N= 2**M = number of pts
C   W= complex array of twiddle factors, length N/2.
C   REV= 0 if bit reversed output ok. 1=must re-order output
C      (REV will be ignored, and output will be properly ordered. Bit
C      reversal WILL be done.)
C
C Outputs:
C   A= complex fft of input A, but only the positive frequency half.
C      Length = N/2+1 complex numbers. A(0:n/2)
C
      subroutine dirr (a,m,N,W,REV)
      integer m,N, i, j,k, REV,wlimit
      integer offset, stage, groups, wincr,powers2(0:10)
      real a(N)
      complex w(N/2),temp
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C Powers2 to avoid calls to POW, DIV
C Twiddle factor array w(k) has (cos, -sin) of 2pi*k/N
CC Assume the caller provides w(k) constants ALREADY initialized
C
C Pre-touch data, for 8kByte fft: (2048 points real)
      IF (N .gt. 1025) THEN
        call fetch(a,%VAL(n/2))
      ENDIF
C
      wlimit = 8*((N/2) - 1)
C "DO 20" stage-loop: doing Complex FFT on length N/2 array. Twiddles are
C for a length N array, so wincr gets scaled by 2.
      DO 20 stage = 1,m-1
        groups = powers2(stage-1)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.
C offset gets N/4,N/8,N/16,...
        offset = powers2(m-1-stage)
        wincr = groups * 2
        call difstep(a,w,groups,offset,wincr,wlimit)
 20   CONTINUE
      call bitrev(a,%VAL(M-1),n/2)
      call realfix(a,w,%VAL(n))
      RETURN
      END
C
// realfix.ss: This is i860(tm) CPU assembly code to revise data from an
// N/2 length Complex FFT.
// (assumes the input data fed to Complex FFT was N real values)
//
// INTEL is not responsible for use nor misuse of this code.
//
// 8/14/89
// This 18-cycle-butterfly loop may be sub-optimal.
//
// output = overwrite the data array used for input. Results are
// complex. Re0,Im0,Re1,Im1,...,Re(N/2),Im(N/2).
// NOTE that output array is 1 element longer than input.
//
// Input is H(k), output is F(k)...
// F(k)=.5*( H(k)+ Hconj(N/2-k) -j*(H(k) -Hconj(N/2-k))*Wconj(k) )
//
// Algorithm from "Numerical Recipes in C", by Flannery, Press, Teukolsky, and
// Vetterling, Cambridge Univ. Press 1988, p.417.
///* The C-version of realfix: */ void realfix_(a,w,n)
///*Input =
// * a(0:n+1): length n/2+1 complex array. Entries 0:n/2-1 are the complex FFT
// *   result, in correct (NON BIT REVERSED) order. Entry n/2 is undefined.
// * w: length n/2 complex array of twiddles. (cos,-sin(2pi*k/n))
// * n: call-by-value, number of REAL input samples
// *Output =
// * a(0:n+1): length n/2+1 complex array.
// *   Format is Re0,Im0,Re1,Im1,...,Re(N/2),Im(N/2).
// * NOTE: To generate entire N-length complex output spectrum, you can copy
// *   conjugate of element (i) to element (N-i).
// */
//float a[], w[]; int n; { int aptr,bptr,wptr; float half=0.5,
//   AR,AI,BR,BI,           /* input values for A,B*/
//   PR,PI,SR,SI,DR,DI,     /*temporary differences, sums, products*/
//   K,L,M,N,               /*temporary products */
//   ER,EI,ERD,EID,
//   FR,FI,FRD,FID,
//   WR,WI;
///*We do first and last elements as special case (Imag=0, W=(1,0))*/
//   AR = a[0]; AI = a[1];
//   a[0] = AR + AI; a[1] = 0;
//   a[n] = AR - AI; a[n+1] = 0;
//for (aptr=2, bptr=(n-2), wptr=2; aptr < n/2; aptr +=2, bptr -=2, wptr +=2)
//{  WR = w[wptr]; WI = w[wptr+1];
//   AR = a[aptr]; AI = a[aptr+1];
//   BR = a[bptr]; BI = a[bptr+1];
//   /* aptr =2,4,6,...,14; bptr=30,28,26,...,18 (if n=32) */
//   /* Note that there is no need to revise the value at the middle of the
//      list, as it is already correct. ( .5*(H(n/4)+Hconj(n/4)) ) */
//   SI = (AI + BI);
//   DR = (BR - AR);
//   K = WR*SI; L= WI*DR; PR = K-L;
//   M = WR*DR; N= WI*SI; PI = M+N;
//   SR = (AR + BR);
//   DI = (AI - BI);
//   ERD = SR+PR; ER = half*ERD;
//   a[aptr] = ER;
//   EID = DI+PI; EI = half*EID;
//   a[aptr+1]= EI;
//   FRD = SR-PR; FR = half*FRD;
//   a[bptr] = FR;
//   FID = PI-DI; FI = half*FID;
//   a[bptr+1]= FI; } /*end of for-loop */ }
// *************** end of C-code for realfix ***************

    .text
    .align .quad
//
define(astart,r16)   //input data base address
define(wptr,r17)     // pointer to W table. Because w-contents depend on N,
                     // we will assume the caller has initialized w() array.
define(N,r18)        //
define(aptr,r20)     //pointer to 1st component of butterfly (load)
define(bptr,r21)     //pointer to 2nd component of bfly (load); DOWNCOUNTER
define(decrem,r24)   //bla decrement
define(count,r25)    // bla counter

define(WR,f18)       //W (twiddle factor), real part
define(WI,f19)       //   "    "    , imag

define(AR,f12)       //element A, real component
define(AI,f13)       //   "    "  , imag
define(ARo,f14)      // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BR,f16)       //element B, real component
define(BI,f17)

define(ER,f20)       //Result of butterfly which overwrites AR
define(EI,f21)       //   "     "     "       "       "     AI

define(half,f22)     //constant 0.5
define(FR,f24)       //Result of butterfly which overwrites BR
define(FI,f25)
define(PR,f26)
define(PI,f27)

define(DR,f28)
define(DI,f29)
define(SR,f30)       //Sum of A+B, real part
define(SI,f31)       //   "    "  , imag  "

    .data
    .align .double
halfloc::   .float 0.5
//
    .text
    .align .quad
_realfix_::
    fst.q   f12,-16(sp)++    //save "local" regs
    adds    -4,r0,decrem     //bla decrement
//
// We do not bother to initialize FP pipes to zero here, as we assume
// this routine is called after another, "safe", pipelined FP routine.
    pfld.l  halfloc,f0
    pfld.d  8(wptr)++,f0     //skip W(0) intentionally. Is a trivial (1,0) value
// init pointers:
    adds    0,astart,aptr
    pfld.d  8(wptr)++,f0
    shl     2,N,bptr         //bptr=total # bytes of input data
    pfld.d  8(wptr)++,half   //0.5 into an fpr
    adds    bptr,astart,bptr // bptr points to a(N)
// here fetch first set of A,B,W before bla-loop
    pfld.d  8(wptr)++,WR
    fld.d   0(aptr),AR       //for 1st and last elements
    adds    -8,N,count       // bla counter (predecrement by 2 butterflies worth)
//
// Do n/4 butterflies: (computing only N/2 elements of complex output, because
// the second N/2 are just complex conjugates of the 1st N/2)
// Definitions for pipe diagram:
// WR = cos(), WI=-sin().
// DR = BR - AR; (difference of Real components of A,B)
// DI = AI - BI; (difference of Imag components)
// SR, SI = sum of A,B
// PR = K - L; where K= WR*SI, L=WI*DR
// PI = M + N; where M= WR*DR, N=WI*SI
// (ER,EI)=complex result to overwrite A.
// (FR,FI)=   "       "    "     "     B.

first_fly::     //fill pipe.
// For 0th butterfly:
//   AR = a[0]; AI = a[1];
//   a[0] = AR + AI; a[1] = 0;
//   a[n] = AR - AI; a[n+1] = 0;
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    r2pt.ss     f0,f0,f0    // 0 0
    mrm1p2.ss   AR,AI,f0    // 0 0 - ER0 - - -
    mrm1s2.ss   AR,AI,f0    // 0 0 0 FR ER
    fld.d   8(aptr)++,AR
    fld.d   -8(bptr)++,BR
    d.pfadd.ss  f0,f0,f0    // 0 0 0 0 FR ER -
    d.pfadd.ss  f0,f0,ER    // 0 0 0 0 0 FR ER0
    d.ra1p2.ss  AI,BI,FR    // - 0 0 SI1 FR0
      nop
    d.mrm1s2.ss BR,AR,EI    // - - 0 - DR1 SI1 EI0
      fst.d   ER,-8(aptr)
    d.mr2pt.ss  WR,f0,FI    // WR - - - - - DR1 SI1 FI0
      fst.d   FR,8(bptr)
    d.ra1p2.ss  BR,AR,SI    // K1 - - SR1 - DR1 SI1
      andh    0x8000,count,r0       //check for negative
    d.m12tpm.ss WI,DR,DR    // L1 K1 - - - SR1 - DR1
      bnc     endfix
    d.r2pt.ss   half,DR,f0  // half M1 L1 K1 SR1
      nop
    d.m12ttpa.ss WI,SI,SR   // N1 M1 L1 K1 SR1
      nop
    d.i2st.ss   f0,f0,f0    // f0 N1 M1 K1 PR1 -
      nop
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.rat1s2.ss AI,BI,f0    // - - N1 M1 DI1 PR1 - -
      nop
    d.i2pt.ss   f0,f0,f0    // f0 - M1 PI1 DI1 PR1
      fld.d   8(aptr)++,AR
    d.r2ap1.ss  SR,f0,PR    // - ERD PI1 DI1 PR1
      fld.d   -8(bptr)++,BR
    d.ra1s2.ss  SR,PR,DI    // - - - - FRD ERD PI1 DI1
      pfld.d  8(wptr)++,WR
    d.r2ap1.ss  DI,f0,PI    // EID FRD ERD PI1
      nop
    d.ra1s2.ss  PI,DI,f0    // ER1 - FID EID FRD -
      nop
    d.ra1p2.ss  f0,f0,f0    // FR1 ER1 FID EID
      nop
    d.ra1s2.ss  f0,f0,f0    // EI1 FR1 ER1 - FID
      bla     decrem,count,fix_loop
    d.pfadd.ss  f0,f0,FI    // EI1 FR1 ER1 - - -FID
      nop
//
// Each butterfly = 1 complx multiply, 3 complx add, 1 real multiply
//                = 8 multiply, 10 add/subtract
//                  3 8-byte fetches (A, B, W)
//                  2 8-byte stores (E, F)
//
//                  approx. 18 cycles per butterfly
fix_loop::                  // KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.mr2pt.ss  f0,FI,ER    // 0 FI1 EI1 FR1 - - - - ER1
      nop
    d.mrm1p2.ss AI,BI,FR    // FI1 EI1 SI2 FR1
      nop
    d.mrm1s2.ss BR,AR,EI    // FI1 DR2 SI2 EI1
      fst.d   ER,-8(aptr)
    d.mr2pt.ss  WR,f0,FI    // WR DR2 SI2 FI1
      fst.d   FR,8(bptr)
    d.ra1p2.ss  BR,AR,SI    // K2 SR2 - DR2 SI2
      andh    0x8000,count,r0       //check for negative
    d.m12tpm.ss WI,DR,DR    // L2 K2 - - - SR2 - DR2
      bnc     endfix
    d.r2pt.ss   half,DR,f0  // half M2 L2 K2 SR2
      nop
    d.m12ttpa.ss WI,SI,SR   // N2 M2 L2 K2 SR2
      nop
    d.i2st.ss   f0,f0,f0    // f0 - N2 M2 K2 PR2
      nop
//                             KR....KI....M1....M2....M3....T....A1....A2....A3....Write
    d.rat1s2.ss AI,BI,f0    // - - N2 M2 DI2 PR2 - -
      nop
    d.i2pt.ss   f0,f0,f0    // f0 - M2 PI2 DI2 PR2
      fld.d   8(aptr)++,AR
    d.r2ap1.ss  SR,f0,PR    // ERD PI2 DI2 PR2
      fld.d   -8(bptr)++,BR
    d.ra1s2.ss  SR,PR,DI    // FRD ERD PI2 DI2
      pfld.d  8(wptr)++,WR
    d.r2ap1.ss  DI,f0,PI    // EID FRD ERD PI2
      nop
    d.ra1s2.ss  PI,DI,f0    // ER2 FID EID FRD
      nop
    d.ra1p2.ss  f0,f0,f0    // FR2 ER2 FID EID
      nop
    d.ra1s2.ss  f0,f0,f0    // EI2 FR2 ER2 FID -
      bla     decrem,count,fix_loop
    d.pfadd.ss  f0,f0,FI    // EI2 FR2 ER2 FID
      nop
//
endfix::
// restore regs
    fiadd.ss    f0,f0,f0    //exit DIM
    fld.q   0(sp),f12
    fiadd.ss    f0,f0,f0    //last DIM pair
    adds    16,sp,sp
    bri     r1
    nop
//
      PROGRAM FFTTEST
c file = real.f
C
C     1-D FFT TEST PROGRAM
C
C     8/14/89
C     Intel assumes no responsibility for use or misuse of this code.
C
      PARAMETER (IREV=1)
      character*8 really
      PARAMETER (REALLY='real')
c     PARAMETER (REALLY='complex')
      PARAMETER (TIMEIT=0, CACHETIME=0)
c REALLY='real' means real-only input, otherwise assume complex input
      DATA IT/200000/
c     PARAMETER (N=2048,M=11)
      PARAMETER (N=1024,M=10)
c     PARAMETER (N=512,M= 9)
c     PARAMETER (N=256,M= 8)
c     PARAMETER (N=128,M= 7)
c     PARAMETER (N=64,M= 6)
c     PARAMETER (N=32,M= 5)
c     PARAMETER (N=16, M=4)
      PARAMETER (PI=3.1415926536)
      COMPLEX X2(N),X(N),X3(N),W(N/2)
      Real ASQR(N),ASQR2(N),XR(N+2),XR1(N+2),XR2(N+2),XR3(N+2)
      complex wtemp
      real rtemp
C
      PRINT *,' FFT test program ....'
      print *,'==============================='
      IF (IREV .eq. 0) THEN
        print *,'NOT counting time for bit-reversal.'
        print *,'DO NOT expect matching answers, without bit-rev'
      ELSE
        print *,'Time for bit-reversal included.'
      ENDIF
      print *,'Time for cache writeback and fills...'
      IF (CACHETIME .eq. 0) THEN
        print *,'  NOT included, if iterating.'
      ELSE
        print *,'  ... included.'
      ENDIF
      print *,'==============================='
      print *,'If iterating... Number of Iterations =',IT
      print *,'Number of Points = ', N
      print *,'(', REALLY,' data)'
      print *,'==============================='
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c 

C     Init twiddle factor array w(k) with (cos, -sin) of 2pi*k/N
      rtemp = 2.0*pi/N
      wtemp = CMPLX(cos(rtemp), -sin(rtemp))
      w(1) = (1.0, 0.0)
      DO 200 k = 2,N/2
 200  w(k) = wtemp * w(k-1)
cc    print *,' W (twiddle) initialization completed......'
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C     INITIALIZE input data
C
      DO 100 I = 1, N
c     constant:
c       Treal = 1.0
c       Timag = 0.0
c     squarewave:
cc      IF (I .lt. N/2) THEN
cc        Treal = 1.0
cc        Timag = 0.5
cc      ELSE
cc        Treal = 0.0
cc        Timag = 0.0
cc      ENDIF
C     ramp function:
        Treal = I-1.0
        Timag = Treal + 0.5
        IF (REALLY .ne. 'real') THEN
          X(I) = CMPLX(Treal, Timag)
          X2(I) = CMPLX(Treal, Timag)
          X3(I) = CMPLX(Treal, Timag)
        ELSE
          X(I) = CMPLX(Treal, 0.0)
          X2(I) = CMPLX(Treal, 0.0)
          XR(I) = Treal
          XR1(I) = Treal
          XR2(I) = Treal
          XR3(I) = Treal
        ENDIF
 100  CONTINUE
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      CALL fft (X2, M, N)
cc    Subroutine fft is Decimation-In-Time, Fortran version.
      CALL dirr(XR,M,N,W,1)
c     (Assuming dirr produces inplace result, items 0:N/2 complex results)
ccccccccccccccccccccccccccccccccccccccc
      IF (IREV .ne. 0) THEN
        IF (TIMEIT .eq. 0) THEN
          call vcompare (XR,X2,N/2+2)
          call cmags (XR,N/2+1,ASQR)
c         cmags to take squared magnitude of complex values in X
          call cmags (X2,N,ASQR2)
c
C         print non-zero results:
          J=0
          DO 700 I = 1,N/2+1
            IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
              WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
 22   FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2,' ASQR2(I)= ',F14.2//)
              J = J+1
              IF (J .GT. 32) GOTO 725
            ENDIF
 700      CONTINUE
 725      CALL TIME
        ENDIF
      ENDIF

      IF (TIMEIT .ne. 0) THEN
ccccccccccccccccccccccccccccccccccccccc
cc-     Timing loop follows:
        print *,' Start Ass. FFT'
        IF (CACHETIME .eq. 0) THEN
          DO 500 I=1, IT, 4
C           Reuse same array, so cache fill and writeback time NOT included.
            CALL dirr(XR, M, N,W,IREV)
            CALL dirr(XR, M, N,W,IREV)
            CALL dirr(XR, M, N,W,IREV)
 500      CALL dirr(XR, M, N,W,IREV)
        ELSE
          DO 504 I=1, IT, 4
C           Alternating between XR,XR1,XR2,XR3 should provide cache misses.
            CALL dirr(XR, M, N,W,IREV)
            CALL dirr(XR1, M, N,W,IREV)
            CALL dirr(XR2, M, N,W,IREV)
 504      CALL dirr(XR3, M, N,W,IREV)
        ENDIF
        print *,' END Ass. FFT'
ccccccccccccccccccccccccccccccccccccccc
      ENDIF
      STOP
      END

c ------------------------------------------------------------------ c
      subroutine vcompare (res,exp,n)
c     VCOMPARE compares 2 vectors, prints out 1st few miscompares
c
      integer n, errcnt
      real res(n), exp(n)
      write (6,12)
 12   format(' *** VCOMPARE: vector comparison beginning')
      data errcnt/0/
      do 30 i = 1,n
        if (AINT(res(i)) .ne. AINT(exp(i))) then
c         {print out error, exit if a lot already}
 120      print *,' *** Error in compares ***'
          write (6,121) i
 121      format (' Item number = ',I6)
          write (6,124) res(i), exp(i)
 124      format (' Res_=',F14.2,' Expected_=',F14.2)
          errcnt = errcnt + 1
          if (errcnt .gt. 19) then
            return
          end if
        end if
 30   continue
      if (errcnt .eq. 0) then
 190    print *,' *** vector compares SUCCESSFUL ***'
      end if
 99   return
      end
c

C     file: fft.f
C
C     FFT routine from Rabiner & Gold, 1975, who copied it
C     from Cooley, Lewis, Welch
C     6/02/89
C
C     Decimation in Time, radix-2, inplace, 1-dimen
C     Inputs:
C       A = complex array of input, up to 1024 pts, single-prec float
C           (maybe more than 1024, uncertain what limit is)
C       M = log (base 2) of number of pts
C         = (Number of stages of FFT)
C       N = number of points, ie, N = 2**M = number of pts
C
C     Outputs:
C       A = complex fft of input A, in NON-bit-reversed order.
C
C     w (twiddle factor) calculated by recursion. Supposedly takes 15% more
C     operations than keeping entire twiddle array as constants pre-allocated.
C

      subroutine fft(a,m,n)
      integer m,n,i,j,k,ndiv2,powers2(0:10)
      integer iplus,offset,stage,index1,groups
      complex a(n),wtemp(2),w(11),temp
C     Init twiddle factor array w() with (cos, -sin)
C     of pi, pi/2, pi/4, ...
      data w(1)  /(-1.0,0.0)/
      data w(2)  /(0.0,-1.0)/
      data w(3)  /(0.7071068,-0.7071068)/
      data w(4)  /(0.9238795,-0.3826834)/
      data w(5)  /(0.9807853,-0.1950903)/
      data w(6)  /(0.9951847,-0.0980171)/
      data w(7)  /(0.9987955,-0.0490677)/
      data w(8)  /(0.9996988,-0.0245412)/
      data w(9)  /(0.9999247,-0.0122715)/
      data w(10) /(0.9999812,-0.0061359)/
      data w(11) /(0.9999953,-0.003068)/
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C     Powers2 to avoid calls to POW, DIV


C     Setup for bit-reversal loop:
      ndiv2 = n / 2
      j = 1
C     "DO 7" loop to in-place-bit-reverse-shuffle input
      DO 7 i = 1, n-1
        IF (i .lt. j) THEN
          temp = a(j)
          a(j) = a(i)
          a(i) = temp
        ENDIF
        k = ndiv2
C       "While (j .gt. k)" /* decrease j by 2**something */
 6      IF (j .gt. k) THEN
          j = j-k
          k = k / 2
          GOTO 6
        ENDIF
C       Add next lower power of 2 to j
 7    j = j+k

C
C     Special case for stage 1: no complex multiplies, simple add
C     (Performance enhancement)
      groups = 2
      offset = 1
      index1 = 1
C     i-loop iterates N/2 times for 1st stage (and would do twice N/4 x for 2nd)
CVD$ NODEPCHK
      DO 8 i = 1,n,2
        iplus = i+1
        temp = a(iplus)
        a(iplus) = a(i) - temp
 8    a(i) = a(i) + temp
C
C     Special case for stage 2: no complex multiplies, simple add
C     (Performance enhancement)
      groups = 4
      offset = 2
      index1 = 1
C     i-loop iterates N/4 times for 2nd stage
C     1st call to i-loop, in stage 2: index1=1, wtemp(1)=(1,0)
CVD$ NODEPCHK
      DO 90 i = 1,n,4
        iplus = i + 2
        temp = a(iplus)
        a(iplus) = a(i) - temp
 90   a(i) = a(i) + temp
      index1 = 2
CVD$ NODEPCHK
CVD$ NOVECTOR
      DO 92 i = 2,n,4
        iplus = i + 2
        temp = CMPLX(AIMAG(a(iplus)),-REAL(a(iplus)))
        a(iplus) = a(i) - temp
 92   a(i) = a(i) + temp
CVD$ VECTOR

C
C     "DO 20" stage-loop executed once for each of the (m) stages of FFT
C     (Except 1st and 2nd stage)
C     offset gets 4,8,16,32,64,128,256...
      DO 20 stage = 3,m
        groups = powers2(stage)
        offset = groups/2
        wtemp(1) = (1.0, 0.0)
C       One twiddle seed (W) calc per stage.
C       We pre-allocated w(11)-array with those values, avoid cos/sin calls
c
        DO 20 index1 = 1, offset
C         "DO 10" i-loop does each butterfly of each stage, with varying
C         twiddles. i-loop iterates N/2 times for 1st stage, N/4 x for 2nd,
C         N/8 x for 3rd stage, N/16 x for 4th stage,... 1 time for last stage.
CVD$ NODEPCHK
CVD$ ALTCODE
          DO 10 i = index1, n, groups
            iplus = i + offset
            temp = a(iplus) * wtemp(1)
            a(iplus) = a(i) - temp
 10       a(i) = a(i) + temp
 20   wtemp(1) = wtemp(1) * w(stage)
      RETURN
      END
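
The structure of subroutine fft (bit-reversal shuffle, then one butterfly pass per stage with the twiddle factor advanced by recursion) can be cross-checked against a direct DFT. The following is a minimal sketch in Python, used here only for illustration; the function names are hypothetical and not part of the original sources. Unlike the Fortran, it does not special-case stages 1 and 2, and it computes each stage's twiddle seed with cmath.exp instead of a DATA table.

```python
import cmath


def bit_reverse_shuffle(a):
    """In-place bit-reversal permutation; 0-based port of the "DO 7" loop."""
    n = len(a)
    j = 0
    for i in range(n - 1):
        if i < j:
            a[i], a[j] = a[j], a[i]
        k = n // 2
        # Clear leading one-bits of j, then set the next lower bit.
        while k <= j:
            j -= k
            k //= 2
        j += k


def fft_dit(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    a = [complex(v) for v in x]
    n = len(a)
    bit_reverse_shuffle(a)
    groups = 2
    while groups <= n:
        offset = groups // 2
        seed = cmath.exp(-2j * cmath.pi / groups)  # per-stage twiddle seed
        w = 1 + 0j
        for index in range(offset):
            for i in range(index, n, groups):
                temp = a[i + offset] * w
                a[i + offset] = a[i] - temp
                a[i] = a[i] + temp
            w *= seed  # twiddle calculated by recursion, as in "20 wtemp(1) = ..."
        groups *= 2
    return a


def dft(x):
    """Direct O(N^2) DFT used only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
```

Feeding both functions the same ramp input that FFTTEST generates (real part I-1, imaginary part 0.5 larger) should give results that agree to within rounding error.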

C
      subroutine cmags (a,n,asqr)
C     Complex magnitude squared.
C     Inputs:
C       A = complex array of input, single-prec float
C       N = number of input points (and output points)
C     Output:
C       asqr = real squared magnitude (R*R + I*I), N elements,
C              single-prec float
      integer n,i
      real asqr(n)
      complex a(n)
      DO 100 i = 1, n
        asqr(i) = (REAL(a(i))*REAL(a(i))) + (AIMAG(a(i))*AIMAG(a(i)))
 100  CONTINUE
      RETURN
      END
## makefile for i860(tm) CPU FFTs (for Unix V/386 programming environment)
## 8/7/89
##
GH=/usr/i860/bin
GHL=/usr/i860/lib
CC=$(GH)/c860
FC=$(GH)/f860
CFLAGS= -OLM -X393 -X405 -X188 -X370
FFLAGS= -OLM -X370 -X393 -X71 -X422
## -X71 uses single-precision math routines
FLFLAGS= -Mx map -e start
LFLAGS= -Mx map -e _main
CLIB=$(GHL)/libc.a
MLIBPSR=$(GHL)/860mtlib.a
MLIB=$(GHL)/libm.a
FLIB=$(GHL)/libf.a
ASM=$(GH)/as860
FLINK=$(GH)/ld860 $(FLFLAGS)
RT=$(GHL)/s5lib.a
LIBS= $(FLIB) $(MLIBPSR) $(MLIB) $(CLIB) $(RT)
LIBCC= $(MLIB) $(CLIB) $(RT)
## NOTE: Order of linked files is CRUCIAL, other orders may give errors
.SUFFIXES:
.SUFFIXES: .f .c .s .ss .o .8
.IGNORE:
## .ignore causes make to ignore error codes from compilers

## To test Fortran plus assembler-fft-stage version:
FILE= ffttest.o fft.o diff.o bitrev.o difstep.o start.o time.o
## To test all-Fortran version of fft:
##FILE= ffttest.o fft.o diff.o difstepf.o start.o time.o
## To test REAL-input version of fft:
RFILE= real.o fft.o dirr.o realfix.o difstep.o bitrev.o start.o time.o

.f.o:
	$(FC) $(FFLAGS) $*.f
	$(ASM) -X -o $*.o $*.s

.c.o:
	$(CC) $(CFLAGS) $*.c
	$(ASM) -X -o $*.o $*.s
// start.ss
// 8/18/89
// Fortran runtime startoff routine
//
        .text
        .globl  start
        .globl  finish
start::
        orh     h%_stack+262128+262144,r0,sp
        or      l%_stack+262128+262144,sp,sp
        adds    -16,sp,sp
        st.l    r1,12(sp)
        call    _main
        nop
finish::
        call    _exit
        nop
        .file   "start.c"
        .data
        .align  .quad
        .lcomm  _stack,262144+262144
        .end

//==================================================================

/* file: time.c. Purpose: establish a label to use for breakpoints */
long time_(x)
long *x;
{   x = x+4;
    return ((long) x);
}

long timestop_(x)
long *x;
{   x = x+4;
    return ((long) x);
}
1.0 BACKGROUND

The Intel 82495 Cache Controller and 82490 Cache 
RAM form a high-speed cache subsystem for the 
Intel486 DX CPU (82495DX/490DX) or the i860 XP 
CPU (82495XP/490XP). The reader should be familiar 
with these chips, as described in: 

1) i860 XP CPU Microprocessor Data Sheet (Intel or- 
der #240874) 

2) Intel486 DX Microprocessor Data Sheet (Intel order 
#240440) 

3) 82495XP Cache Controller/82490XP Cache RAM 
Data Sheet (Intel order #240956, June 1991) 

or Intel486 DX CPU Microprocessor Cache-Chip 
Set Data Sheet (Intel order # 241084, June 1991) 

Diagrams of systems containing the 82495 and 82490 
appear in Figure 1, and a more detailed diagram of the 
CPU/82495/82490 core appears in Figure 2. (Note: for 
simplicity, the 82495XP/82490XP and 82495DX/ 
82490DX will be referred to generally as 
82495/82490 — the XP or DX should be inferred de- 
pending upon the CPU being utilized.) In such systems, 
the 82495 controls a cache external to the CPU, and 
includes the cache tags. It can interface gluelessly to an
Intel486 DX CPU or i860 XP CPU microprocessor,
allowing the processor bus to run at 50 MHz with zero
wait-states, while the memory bus can remain at a low- 
er frequency. Both writeback and writethrough proto- 
cols are supported. Concurrent operations can occur 
simultaneously on the local CPUbus and the shared 
memory bus. All requisites for multiprocessors are in- 
cluded in the 82495, Intel486 DX CPU, and i860 XP 
CPUs, but the 82495 also is useful for a uniprocessor 
system performance enhancement. 

The 82490 cache RAM contains 32 kBytes per chip, 
and is used in groups of 4, 8, or 16 to implement caches 
from 128 to 512 kBytes. It supports two-way associativ- 
ity, delayed writebacks, burst transfers, and boundary 
scan test. The 82490 contains much more than RAM 
cells — it includes various buffers, queues, and support 
for several bus protocols. It is two-ported, with simulta- 
neous access on both the CPU side and Memory-Bus 
side. The cache optionally supports parity using addi- 
tional 82490 chips. 

Configuration options allow a variety of memory bus 
widths (32 to 144 bits), cache line widths (16 to 128 
bytes), and asynchronous or synchronous transfers. 
The configuration is selected by the polarity of various 
pins at reset time. 


Figure 1. CPU + 82495 + 82490 Systems
(Three configurations are shown: 1. Uniprocessor, with an Intel486 DX
or i860 XP CPU and a 32-, 64-, or 128-bit memory bus; 2. Homogeneous
Multiprocessor; 3. Heterogeneous Multiprocessor.)


Figure 2. CPU + 82495 + 82490 Core


The Memory Bus Controller (MBC) portion of the sys- 
tem interfaces the 82495 and CPU to the system bus. 
The MBC converts bus status and command lines into 
requests to the 82495, for example, to monitor the prog- 
ress of an ongoing bus transaction from another CPU 
subsystem to ensure consistency with 82495 + 82490 
cache contents. Likewise the MBC adapts 82495 re- 
quests to the bus protocol and arbitrates for ownership 
of the bus. Most CPU requests will not require MBC 
action; only I/O cycles, cache bypass requests, and 
82495 cache misses are forwarded by 82495 to the 
MBC, while external cache hits are handled totally by 
82495 + 82490. 


2.0 WHY A CUSTOM BUS 
INTERFACE? 


A large multiprocessor of 6 or more CPUs needs a wide 
and fast bus such as Futurebus+, with split-transaction
capability to prevent bus bottlenecks from slowing
the performance of every processor. Hierarchies of bus- 
es and caches can further allow more CPUs with rea- 
sonable performance increases as CPUs are added. A 
Futurebus+ hierarchy maintains concurrent transac- 
tions on each bus, and “bridge” caches at the junctions 
of buses echo them from bus to bus when the bridge 
detects that one transaction may affect cached copies 
on the other bus. 



Compatibility with existing buses is often crucial in 
product design, so that new faster components can plug 
into existing machines and I/O devices. The flexible 
82495/82490 bus interface allows compatibility as well 
as extension. 


Clearly the entire interface to a memory bus (abbreviat- 
ed M-bus) could have been incorporated in the 82495 
and 82490 chips. This approach has been followed by 
some other cache chipsets. 

However, such integration suffers from inflexibility and 
bandwidth limitations. As shown in Figure 3, the per- 
formance and cost targets of the system determine the 
size and complexity of the bus, so if the bus is “hard- 
wired” into the cache controller chip, it will be too 
costly for small systems and too slow for larger systems.
With the bus interface implemented separately, it
can be a complex ASIC for a high-bandwidth complex 
system, or a few EPLDs for a PC. The same cache 
controller can improve performance of a variety of bus- 
based CPUs. 

For a desktop PC, a 32-bit simple memory bus is ade- 
quate. For a workstation or small multiprocessor of 
two CPUs, a faster 64-bit bus may be required to give 
adequate bandwidth for graphics frame buffers and in- 
tensive numeric calculations. Bus bandwidth require- 
ments grow as the MIPS rating of each CPU in a sys- 
tem grows; for example, a bus adequate for 12 386 
CPUs may be too slow for 6 Intel486 DX CPUs, as 
they process far more data per second. 
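
To make the width and frequency tradeoff concrete, peak bus bandwidth can be estimated as width times clock rate. The sketch below is only a back-of-the-envelope illustration: the function name is hypothetical, and the default of one data transfer per clock is an assumption that ignores arbitration, address phases, and wait states, so real buses deliver less.

```python
def peak_bandwidth_mbytes(width_bits, clock_mhz, clocks_per_transfer=1):
    """Upper bound on bus bandwidth in Mbytes/s.

    Assumes one data transfer every `clocks_per_transfer` clocks
    (an idealization, not a figure from any data sheet).
    """
    return width_bits / 8 * clock_mhz / clocks_per_transfer


# A 32-bit bus at 33 MHz versus a 64-bit bus at the same clock.
narrow = peak_bandwidth_mbytes(32, 33)
wide = peak_bandwidth_mbytes(64, 33)
```

Under these assumptions, doubling the width at a fixed clock doubles the peak figure, which is why the wider buses are paired with the larger systems.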


Thus the 82495 and 82490 will be used in a wide variety 
of systems, including standard buses like Futurebus+.
For proprietary buses, the “proprietor” can design an 
ASIC or PAL MBC incorporating the required fea- 
tures. 


3.0 GUIDELINES 

This document exists to clarify the necessary compo- 
nents and tradeoffs of a Memory Bus Controller. The 
example designs here have not been tested, and signal 
definitions of the i860 XP CPU, Intel486 DX CPU, 
82495, and 82490 chips are subject to change. 

The memory bus controller is not allowed to use (and 
thus add capacitance to) any of the CPU pins used by 
the 82495/82490, except those listed in the 82495 Data 
Sheet [82495/490DS] description of the BLE# pin. 
Only the CPU pins BE7-0#, PWT, PCD, LEN, 
CACHE#, BRDY#, PCYC, and CTYP have suffi- 
cient timing margin to tolerate the MBC load. 
Figure 3. System Type and Bus Requirements
(Chart relating system type to bus features. System types, in
increasing size: Uniprocessor; Small Multiprocessor, 2-3 CPUs; Medium
Multiprocessor, 4-8 CPUs; Large, 8+ CPUs. Features charted across
those types:
  CPU<->Memory Interconnect: simple bus; pipelined bus; crossbar or
    bus hierarchy
  Bus Width, Frequency: 32 bit, 20-40 MHz; 32 or 64; 64 or 128,
    33 MHz or more; 64 or 128 bit
  Cache: WriteThru; WriteBack; 82495 + 82490; 3rd Level Cache
  Arbitration: Central (HOLD/HLDA); Distributed Arbitration, Bus Parking
  LOCKing: Bus Lock; Address Lock
  Error Detect: Parity; ECC on Memory, Retry
  Bus Protocol: Simple; Pipelined; Split Transaction
  Extra Features: Read-for-Ownership; Cache-to-Cache Transfers;
    External FIFOs)


Shared Bus Interconnect 

When used in a multiprocessor, the 82495 assumes a 
shared-memory, shared-bus environment so that it can 
observe and “snoop” accesses by others which might 
conflict with the memory locations it has cached. In a 
crossbar or other multipath interconnect, shared-bus 
coherency can be emulated for the 82495 or it can be 
used non-coherently. Either a centralized directory or a 
hierarchy of buses and caches can do the emulation. A 
directory would keep a record, for each line of main 
memory, of caches which have the line. When a cache 
first writes to a line of memory, the central directory 
broadcasts an invalidation message to all other caches 
containing that line. [Agarwal88] 
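
The directory scheme just described can be sketched as a small bookkeeping structure. This is a hypothetical illustration only: the class and method names are invented here, cache identifiers are plain integers, and the model ignores line states, races, and the actual bus messages.

```python
class Directory:
    """Central directory: for each memory line, which caches hold a copy."""

    def __init__(self):
        self.sharers = {}  # line address -> set of cache ids holding the line

    def read(self, cache_id, line):
        """Record that cache_id now holds a copy of the line."""
        self.sharers.setdefault(line, set()).add(cache_id)

    def write(self, cache_id, line):
        """Record a write and return the caches that must be invalidated."""
        holders = self.sharers.setdefault(line, set())
        invalidated = sorted(holders - {cache_id})
        # After the invalidation broadcast, only the writer holds the line.
        self.sharers[line] = {cache_id}
        return invalidated


directory = Directory()
for cache in (0, 1, 2):
    directory.read(cache, 0x100)
to_invalidate = directory.write(1, 0x100)  # caches 0 and 2 lose their copies
```

A second write by the same cache finds no other sharers, so no further invalidation traffic is generated for that line.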


4.0 MBC BLOCK DIAGRAM 

Shown in Figure 4 is a high-level block diagram of the 
functions and interfaces involved in the Memory Bus 
Controller. Part of the MBC operates on the high-speed 
clock (CLK) which the CPU and 82495 use. While the 
M-bus could use the 50 MHz CPU CLK, such a fast 
M-bus is hard to design. The part of the MBC which 
interacts with the memory bus protocol runs on an M- 
bus clock (MCLK), if that protocol is clocked. Also 
possible is an unclocked M-bus protocol using the 
82495/82490 in “strobed” mode. The MBC contains 
synchronizers and a few signals which cross between 
the two clock domains. Synchronizers, consisting of 
specially-designed flip-flops, allow a clocked state ma- 
chine to use data which may be transitioning near the 
edge of the clock. Unsynchronized data can cause 
metastability in latches, where their output changes 
slowly and unpredictably. 
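
The behavior of a two-stage (dual-rank) synchronizer can be modeled at the clock-edge level. The sketch below is a simplified, hypothetical model: it captures only the added latency of passing a signal through two flip-flops in the receiving clock domain and does not model metastability itself. The class name is an assumption for illustration.

```python
class DualRankSynchronizer:
    """Two flip-flops in series, both clocked by the receiving domain."""

    def __init__(self, initial=0):
        self.rank1 = initial  # first flop: samples the asynchronous input
        self.rank2 = initial  # second flop: output seen by the state machine

    def clock(self, async_in):
        """Advance one receiving-domain clock edge and return the output."""
        # Both flops update on the same edge, so rank2 takes the OLD rank1.
        self.rank2 = self.rank1
        self.rank1 = async_in
        return self.rank2


sync = DualRankSynchronizer()
outputs = [sync.clock(v) for v in [1, 1, 1, 0, 0]]
```

The first rank is given a full clock period to settle before the second rank samples it, which is the reason a state machine sees each input change one edge later than it would with a single flop.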
Figure 4. Generic Block Diagram of MBC
(Signal labels visible in the diagram include CDTS#, CADS#, CW/R#,
CD/C#, and CM/IO#.)
5.0 DESIGN EXAMPLE: A
UNIPROCESSOR MBC 

A simple MBC design example is an adapter to allow 
plugging a daughtercard module with an Intel486 DX 
CPU, 82495, and 4 82490s into an Intel486 DX CPU 
microprocessor PGA socket. The memory bus is an 
Intel486 DX CPU-bus, allowing the external cache to 
be a performance enhancing option. It assumes a “di- 
vided synchronous” M-bus clock, where the M-bus 
runs at the CPU CLK speed. Thus no synchronizers 
are needed. The MBC uses both the CPU CLK and the 
M-bus MCLK. 

This design requires 

• 1 74F377 latch 

• 6 PLDs containing 10 state machines 

• 2 chips for clock generation, not part of the MBC 

Approximately 70 signal pins connect the MBC block 
to the CPU, cache, and memory. Only a uniprocessor is 
supported, although the bus protocol and MBC could 
be enhanced for multiprocessing coherency. Figure 5 
shows a block diagram. Details of the design can be 
found in Appendix B. 


6.0 DESIGN EXAMPLE: A 
MULTIPROCESSOR MBC 

An i860 XP CPU multiprocessor-capable MBC (Figure 
6) using an M-bus similar to the i860 XP CPU bus is 
proposed. For clocking, it uses an MCLK of 33 MHz, 
totally asynchronous to the 50 MHz CPU CLK. It 
could therefore be upgraded to faster CPU CLK rates 
in the future without changing the design or M-bus. 

The design requires: 

• 2 74F377 octal latches (for BE7-0#, etc...) 

• 2 74AS4374 dual-rank-synchronizer octal registers 

• 16 PLDs 

• 2 GA1110 clock drivers for clock distribution

These components could be integrated into a single 
ASIC chip, as about 120 signals connect to the MBC. 
The MBC can be used for a uniprocessor or multipro- 
cessor i860 XP CPU design. Details can be found in 
Appendix C. 
Figure 6. Block Diagram of Multiprocessor MBC


AP-452


7.0 MBC FUNCTIONS 

Table 1 shows the responsibilities of the Memory Bus
Controller for uniprocessors and multiprocessors (MP).
The multiprocessor features exist mainly to prevent bus
over-utilization. However, some of the jobs common to
both, such as arbitration and snooping, are more complex
in MP. The pin lists in the table are not exhaustive.


Table 1. Functions of the Memory Bus Controller

MBC Functions for Uni and Multiprocessors                                   Pins

    1. RESET and Configuration                                              RESET, HOLD, CAHOLD
    2. FLUSH# and SYNC#                                                     CAHOLD, FSIOUT#, FLUSH#, SYNC#
 ** 3. Bus Error Detection, Retry                                           PCHK#, BERR
    4. CPU transfer tracking (burst count)                                  CLEN1:0, RDYSRC, BRDY#
    5. Mbus transfer tracking (burst count)                                 CRDY#, MBRDY#
       (including writeback, allocation)
    6. Synchronization between clock domains                                BGT#, CADS#, MBRDY#
 ** 7. Memory-bus pipelining                                                BGT#, CNA#, MEOC#, CRDY#
 ** 8. MBC-to-82495 pipelining                                              CNA#, MALE
    9. Memory Bus Arbitration                                               BGT#
   10. Cacheability decode                                                  KWEND#, MKEN#
 **11. Redrive bus signals for BTL or ECL levels or heavy capacitive loads
 **12. Packing (convert 32-bit M-bus for 64-bit 82490 size, or 8-bit ROM)   MBRDY#
 **13. Bus messages (interrupts, flushes)                                   INT(R), FLUSH#
 **14. Boundary scan and selftest                                           TCK, TMS, SLFTST#
 **15. Performance monitoring (M-bus utilization, read vs. write)           CW/R#, CADS#
   16. Snoop handshake (snooping DMA or other CPU)                          SNPSTB#, SNPCLK, SNPCYC#
   17. Snoop writebacks                                                     MHITM#, SNPADS#

Additional MBC Functions for Multiprocessors                                Pins

   M1. Snoop window (as master)                                             SWEND#, MWB/WT#
   M2. Backoff 82495 when request was to M-line in another 82495            MAOE#
 **M3. Snoop filtering (via SMLN#)                                          SMLN#
 **M4. Cache-to-cache transfers (CTCT)                                      DRCTM#, MBAOE#
 **M5. Read-For-Ownership (RFO)                                             PALLC#, DRCTM#, MFRZ#
 **M6. Split transactions (requires duplicate tag array)                    CWAY
 **M7. Memory cycle abort (after MHITM#)                                    MHITM#
   M8. LOCK# protection                                                     KLOCK#, CAHOLD, SNPCYC#
 **M9. LOCK# de-assertion (for back-to-back Intel486 DX CPU locks)          KLOCK#
 **M10. CPLOCK# (Intel486 DX CPU only)                                      CPLOCK#
 **M11. Snoop during LOCK#                                                  KLOCK#
 **M12. Multiprocessor Interrupts                                           INT, NMI(BERR)
        (for Message-Based Interrupts or TLB shootdown)

 ** = optional and implementation dependent


MBC Functions for Uni and
Multiprocessors

Reset and configuration control includes strapping of
the following pins to resistors at Vcc or Ground, or
“temporary strapping” of multifunction pins whose
state during the last 16 clocks before the falling edge of
RESET determines 82495, 82490, or CPU configuration.
The circuit feeding RESET to these chips should
keep it active at least 16 CLK periods. “Temporary
strapping” means including RESET or A RESET in
the logic equation for the pin. The multifunction pins
are indicated with brackets [ ] below:

i860 XP CPU pins:

PEN#, FLINE#, HOLD

Intel486 DX CPU pins:

RDY#, BOFF#, BS8#, BS16#, HOLD, FLUSH#

82495 pins:

CFG3, CFG2[KWEND#], CFG1[SWEND#],
CFG0[CNA#], CPUTYP[HITM#],
FPFLDEN[FPFLD#], NCPFLD#[FLUSH#],
SNPMD[SNPCLK], C490LDRV[BGT#],
MEMLDRV[SYNC#], SLFTST#[CRDY#], TEST,
HIGHZ#[MBALE], CACHE# (NOTE: the
FPFLDEN pin is defined for Intel486 DX CPU as
PLOCKEN[CPLOCK#]. The 82495XP does not use
CFG3 for configuration in i860 XP CPU systems.)

82490 pins:

MTR4/TR8#[MSEL#], MX4/MX8#[MZBT#],
MSTBM[MCLK], MEMLDRV[MFRZ#], PAR#,
MOCLK, (BOFF#, HITM#)

Intel486 DX CPU: The “unused” Intel486 DX CPU
inputs (RDY#, BS8#, BS16#, BOFF#) with 82495
should be connected as described in the Intel486 DX
CPU Chipset EDS.

The Intel486 DX CPU FLUSH# input should be tied
high, unless the system requires FLUSH messages from
the M-bus to be interpreted. Then the MBC must assert
the FLUSH# inputs to both Intel486 DX CPU and
82495, because 82495 does not do back-invalidates to
the Intel486 DX CPU for FLUSH#. During RESET,
the Intel486 DX CPU FLUSH# input must be kept
high to avoid putting the CPU in tristate-output-test-mode
(Intel486 DX CPU Data Sheet Section 8.4).

i860 XP CPU: The i860 XP CPU input PEN# (Parity
trap ENable) must be strapped high unless the memory
data bus feeding the 82490s always contains good parity
and the i860 XP CPU system uses 2 82490s in parity
mode; in the latter case, strap PEN# low. HOLD
should be strapped low and FLINE# strapped high, as
those features cannot be used with 82495.

82495: The multiplexed 82495 pin FPFLDEN[FPFLD#]
becomes an output after RESET, so the
PAL or ASIC which creates FPFLDEN must float it
as soon as RESET = 0. The same multiplexing applies
to Intel486 DX CPU mode, where the pin is named
PLOCKEN[CPLOCK#]. Likewise, the multiplexed
input FLUSH#[NCPFLD#] should be driven high
the same clock that RESET falls, to prevent an unnecessary
82495 cache flush. In Intel486 DX CPU systems,
the 82495 input CACHE# must be tied low and
HITM#[CPUTYP] must be tied LOW, as it signals
CPUTYPE to 82495.

82490: The 82490DX inputs HITM# and BOFF#
must be tied high in an Intel486 DX CPU system, as
they exist to support the i860 XP CPU writeback
cache. With an i860 XP CPU, the 82490XP input
BOFF# comes from 82495XP but HITM# from i860
XP CPU feeds 82495XP and 82490XP.

The 82490 input MOCLK must also be tied low or to a
delayed version of MCLK, if clocked-M-bus mode is
used. This is because the 82490 senses the state of
MOCLK after RESET ends: if MOCLK stays low,
the 82490 uses MCLK to drive MDATA. If MOCLK
toggles after RESET, the 82490 will use MOCLK to
switch output data. Using a delay-line externally to the
82490 to generate MOCLK from MCLK allows the
design a longer hold-time at other receivers of MDATA
in the system. For a clocked-M-bus (non-synchronous
to CLK), the undelayed MCLK should be connected
to the 82495’s SNPCLK input and should be
toggling during RESET to tell the 82495 to snoop in
clocked mode.

During RESET, the 82495 and 82490 will float the
bidirectional lines they share with the CPU, such as
CDATA and A31:A3. Thus driver contention is avoided.
The RESET input should be synchronous to CLK
and deasserted to the 82495, 82490s, and CPU at the
same time, to assure that the configuration controls get
properly passed between them.

For Intel486 DX CPU resets, refer to [82495/490DS]
for the sequencing of HOLD, HLDA, CAHOLD, and
RESET required to reset only the processor without
destroying 82495 cache contents. For that purpose, a
separate RESET line is advised for the CPU and
82495/82490. The CPU RESET line must be wired to
the WRMRST input of 82495, to force 82495 to assert
the BRDYl# input to the CPU during a reset of CPU-
only (the CPU uses the BRDYl# input during RESET
to know of the 82495’s existence). The HOLD input of
the Intel486 DX CPU and i860 XP CPU processors
should be kept low during normal operation with the
82495, because floating the processor outputs may yield
undefined 82495 behavior.

FLUSH# (and SYNC#) of caches requested by software
must be decoded from the 82495 outputs CM/IO#,
CD/C#, and CW/R# (= 001) and latched
BE3-0# from the CPU. BE3-0# values of 0111 or
1101 should activate the 82495 FLUSH# input, as the
Intel486 DX CPU outputs them in response to the
INVD and WBINVD instructions, respectively. Sync
and flush commands may also come from the bus as a
message in a multiprocessor system. The 82495
allows assertion of FLUSH# or SYNC# at
any time, and will delay the beginning of the flushing
action until all current CPU and M-bus cycles have
completed. The inputs are edge-sensitive. If the bus
defines cache flush messages, the MBC may activate the
Intel486 DX CPU FLUSH# input as well as the
82495’s in response to bus message decodes.
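
The software flush decode described above can be written as a small truth-table function. This is a hedged sketch: the function name and the representation of pins as 0/1 integers are assumptions made for illustration, and only the two encodings named in the text are recognized.

```python
def decode_flush_cycle(cm_io_n, cd_c_n, cw_r_n, be3_0_n):
    """Return the instruction behind a special cycle, or None.

    cm_io_n, cd_c_n, cw_r_n: levels of CM/IO#, CD/C#, CW/R# (0 or 1).
    be3_0_n: latched BE3-0# as a 4-bit integer, BE3# in the top bit.
    """
    # The text gives (CM/IO#, CD/C#, CW/R#) = 001 for these cycles.
    if (cm_io_n, cd_c_n, cw_r_n) != (0, 0, 1):
        return None
    if be3_0_n == 0b0111:
        return "INVD"
    if be3_0_n == 0b1101:
        return "WBINVD"
    return None


def should_assert_flush(cm_io_n, cd_c_n, cw_r_n, be3_0_n):
    """True when the MBC should activate the 82495 FLUSH# input."""
    return decode_flush_cycle(cm_io_n, cd_c_n, cw_r_n, be3_0_n) is not None
```

Because the FLUSH# input is edge-sensitive, real logic would generate a pulse from this decode rather than hold a level; that detail is outside the scope of the sketch.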

Bus Error or Timeout Detection logic in the MBC can
use the CPU’s PCHK# output or other M-bus-specific
signals to detect errors. Note that the assertion of
PCHK# will occur near the time of the error on the
M-bus ONLY for non-cacheable reads or 82495-cache-miss
reads. For 82495-hits and CPU-idle cycles,
PCHK# may arise due to a floating or erroneous CPU
data bus value transferred on the M-bus much earlier.
PCHK# must be ignored by the MBC except during
the CLK after data transfer to the CPU was signalled
by the MBC’s CPU BRDY#, because PCHK# indicates
i860 XP CPU bus parity status at all times, not
just during clocks of BRDY# activation. The processor
inputs INT, BERR, or NMI can be asserted by the
MBC to signal errors. To detect errors originating in
the CPU or 82490 upon a write(back), the MBC can
check parity on the 82490 MDATA pins or on the M-bus.

If the memory bus includes a retry protocol, the MBC 
bears the responsibility to implement it, because the 
82495 will not retry accesses. For a pipelined MBC in- 
terface when the retry occurs after CNA# to the 
82495, the MBC must latch the address and other con- 
trols (CW/R#, CM/IO#, etc...) from the 82495 to use 
in retries. Retry should be triggered by signals other 
than the CPU PCHK# output, because the CPU data 
transfer cannot be retried although the M-bus transfer 
can. 

The 82490 can restart a burst data transfer (for the case
of an error detected after the first MBRDY# but before
MEOC# and before CRDY#). To restart the
82490, the MBC must deassert MSEL# for at least 1
MCLK.

While parity is supported by the 82495 and 82490, 
ECC (Error Correcting Codes) cannot conveniently be 
used within the cache. ECC can be implemented on the 
memory system, but no loads are permitted on the 
CPU-to-82495/82490 interface wires for error checking 
logic. 

Scenarios requiring MBC action are 

1) CPU based requests (“Master” mode): 

• 82495 cache read miss (and line fill) 

• 82495 cache write miss 

• Non-cacheable CPU read (including i860 XP CPU 
pfld) 

• Writethrough (to S-state line) or Non-cacheable 
CPU write 


• I/O reads and writes 

• LOCKed reads and writes (will be readthrough or 
writethrough) 

2) 82495 based requests (“Master” mode): 

• Allocation due to write-miss (line fill) 

• Replacement writebacks 

• SNPADS# writebacks 

3) Requests from other masters (“Slave” mode): 

• Snooping of DMA accesses 

• Snooping of accesses of other CPUs (in a multipro- 
cessor) 

• Bus-specific requests, like interrupt messages, reset 
requests, cache flushes, configuration registers, ID 
registers, timeout detection, acknowledgements, 
TLB shootdown 

Transfer Tracking 

Tracking of transfers on the M-bus and CPUbus is
required of the MBC during all of the above scenarios.
This tracking (counting) of transfers involves activating
BRDY# the correct number of times for the CPU and
MBRDY# (a possibly different number) for the 82495
and 82490. Transactions on the CPUbus which must be
MBC-controlled can be 1, 2, or 4 data transfers, decoded
from the BLE#-latched CPU pins:

Intel486 DX CPU: BE3-0#, PWT, PCD
i860 XP CPU: BE7-0#, PWT, PCD, LEN,
CACHE#

and from the 82495 pins CW/R#, MCACHE#,
RDYSRC (and CLEN1, CLEN0 for Intel486 DX CPU
mode).

See [82495/490DS] for a complete definition of the
encodings. The BRDY# activations must be done only if
RDYSRC = 1, and always correspond to the first 1, 2,
or 4 MBRDY#s for the 82490-M-bus interface. The
number of MBRDY#s is always greater than or equal to
the number of BRDY#s, even for a 128-bit M-bus.

Bursts for line fills and writebacks on the CPUbus always
are 4 transfers, but with some 82495 configurations the
M-bus is 8 transfers. The addresses are nonsequential when
the first access is not at the zeroth word of the line. The
addresses corresponding to each BRDY# and MBRDY# follow
these rules:

1) CPU burst addresses wrap at CPU line length.

2) When the line address is odd (A2 = 1 for 4-byte bus;
A3 = 1 for 8-byte bus; A4 = 1 for 16-byte M-bus), the
next address transferred on CDATA and MDATA is the
LOWER address (e.g., 3 followed by 2). The
odd-first-then-even pattern continues for all transfers
of the burst. This order optimizes interleaved DRAM
systems, and applies to both the M-bus and CPUbus.

3) 82490 bursts on CDATA wrap at CPU line length.
82490 MDATA burst addresses wrap at 82490 line length.
For example, a linefill with LR = 4 and a first
Intel486 DX CPU address (A5:A2) = E,

• 82490 CDATA ordering is E F C D

• 82490 MDATA ordering is CDEF 89AB 4567 0123
(128-bit M-bus) OR EF CD AB 89 67 45 23 01
(64-bit M-bus)


For LR = 2 (Line Ratio of 82495 to CPU) and CPUbus width = M-bus, below are the burst orders. Each address 
corresponds to one 4-byte transfer (for Intel486 DX CPUs) or 8-bytes (for i860 XP CPU). Time is increasing left-to- 
right: 


First Address    CPU transfers    M-bus transfers

      0            0 1 2 3        0 1 2 3 4 5 6 7
      1            1 0 3 2        1 0 3 2 5 4 7 6
      2            2 3 0 1        2 3 0 1 6 7 4 5
      3            3 2 1 0        3 2 1 0 7 6 5 4
      4            4 5 6 7        4 5 6 7 0 1 2 3
      5            5 4 7 6        5 4 7 6 1 0 3 2
      6            6 7 4 5        6 7 4 5 2 3 0 1
      7            7 6 5 4        7 6 5 4 3 2 1 0

For M-bus = 2 * CPUbus width (both buses using 4 transfers):

First Address    CPU transfers    M-bus transfers

      0            0 1 2 3        01 23 45 67
      1            1 0 3 2        01 23 45 67
      2            2 3 0 1        23 01 67 45
      3            3 2 1 0        23 01 67 45
      4            4 5 6 7        45 67 01 23
      5            5 4 7 6        45 67 01 23
      6            6 7 4 5        67 45 23 01
      7            7 6 5 4        67 45 23 01



The remaining transfer orderings for other LR values 
can be generated similarly, as an exercise for the reader. 
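
The burst orders tabulated above can be reproduced by exclusive-ORing the first address with the transfer count. The sketch below is an observation drawn from those tables and from the LR = 4 example, not a definition taken from the data sheet; the function name `burst_order` and its parameters are hypothetical. Addresses are in units of one CPUbus transfer, and `width_ratio` is the M-bus width divided by the CPUbus width.

```python
def burst_order(first, transfers, width_ratio=1):
    """Return the burst as a list of transfers.

    Each transfer is a tuple of the CPU-width word addresses it carries.
    `transfers` is assumed to be a power of two (4 for the CPUbus,
    4 or 8 for the M-bus), so the XOR never leaves the line.
    """
    group = first // width_ratio          # index of the first bus-wide group
    order = []
    for i in range(transfers):
        g = group ^ i                     # odd-first-then-even, wrapping in the line
        order.append(tuple(g * width_ratio + k for k in range(width_ratio)))
    return order


# CPUbus burst, first address 5 (wraps at the 4-transfer CPU line)
print(burst_order(5, 4))
# M-bus burst for LR = 2, M-bus twice as wide as the CPUbus, first address 3
print(burst_order(3, 4, width_ratio=2))
```

Calling `burst_order(first, 8)` gives the M-bus rows of the equal-width table, and `burst_order(0xE, 4, width_ratio=4)` gives the 128-bit MDATA ordering of the LR = 4 example.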

For requests originated by the 82495, the MBC must
ignore the CPU pins (CACHE#, LEN, PWT, PCD,
PCYC, CTYP, and BE7#-BE0#). These requests are
writebacks, allocations, or linefills. Also the MBC must
prevent the transfer of those signals to the M-bus for
82495 requests; for example, it must force all
BE7#-BE0# active during writebacks. The 82495-based
requests can be recognized by:

RDYSRC = 0 .AND. MCACHE# = 0 (for writebacks,
linefills, allocations)

RDYSRC = 0 .AND. MCACHE# = 0 .AND.
MKEN# = 0 (for linefills, allocations)


For posted write requests (RDYSRC = 0 and
MCACHE# = 1), the length is 1, 2, or 4 transfers and
the MBC must heed the BLE#-latched BE7-0#,
LEN, and CACHE#.
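
The recognition rules above amount to a small decode of three pin levels. The following is a minimal sketch, assuming the MBC samples RDYSRC, MCACHE#, and MKEN# as 0/1 levels (active-low pins are passed as their electrical level, so 0 means asserted). The function name and the returned strings are hypothetical labels, not data-sheet terms.

```python
def classify_request(rdysrc, mcache_n, mken_n):
    """Classify a request from pin levels, following the rules in the text.

    rdysrc   : RDYSRC level (1 = MBC must supply BRDY# to the CPU)
    mcache_n : MCACHE# level (0 = asserted)
    mken_n   : MKEN# level (0 = asserted)
    """
    if rdysrc == 1:
        # MBC paces the CPU: heed the BLE#-latched CPU pins for length.
        return "CPU request paced by MBC"
    if mcache_n == 0:
        if mken_n == 0:
            return "82495 request: linefill or allocation"
        return "82495 request: writeback, linefill, or allocation"
    # RDYSRC = 0 and MCACHE# = 1
    return "posted write: 1, 2, or 4 transfers per BE7-0#, LEN, CACHE#"


print(classify_request(rdysrc=0, mcache_n=1, mken_n=1))
```

For the two 82495-request outcomes the MBC would ignore the CPU pins and force all byte enables active on writebacks, as the text requires.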


Clock Boundaries and Synchronization 

To optimize performance, the 82495/82490 allow total
decoupling of the CPU clock at 50 MHz from the
M-bus clock. While both the CPU and M-bus could
run at 50 MHz, the physical size of the M-bus would be
severely constrained. Future faster versions of CPU and
82495/82490 would make a synchronous M-bus even
less feasible. However, with a 100% synchronous
interface, little time is lost in relaying requests from the
82495 CADS# to the M-bus, and in transferring data
from the M-bus to the CPUbus.

Yet with careful design, a slower M-bus such as
33 MHz can handshake with a 50 MHz 82495 with
only a couple of clocks spent on synchronizing.
Furthermore, the transfers requiring synchronizing are
fairly rare: uncached cycles, cache misses, and snooping.
CPU performance is improved further because the
82495/82490 always post writes destined for the
M-bus, allowing the CPU to continue processing upon
write cache-misses and non-cacheable writes.

Most of the 82495 operates on the CPU CLK. Only the 
snooping control inputs operate on another clock, 
called SNPCLK (SNPSTB#, SNPINV, SNPNCA). 
SNPCLK can be the same as the MCLK controlling 
82490 MDATA. A SNPCLK can be used with 82495, 
even if the 82490 is strobed without an MCLK. All 
82495 outputs, including snooping results (MHITM#, 
MTHIT#, SNPCYC#, and SNPBSY#) remain on 
the CPU CLK. 

The 82490 operates half in the CPU CLK domain and
half in the M-bus domain. While no control signals flow
through 82490 between memory and the CPU, 82490
implements a flow-through data connection of CDATA
to MDATA. Synchronization of the 2 DATA
paths is unneeded, as the control signal MBRDY# gets
synched by the MBC to the CPU clocked BRDY#.
The MBRDY# and BRDY# inputs control multiplexers
inside 82490 to choose which part of a line-fill or
write is transferred to/from the bus. The MDATA
input latches are closed on MCLK (or MISTB for
non-clocked operation), and CDATA input latches are
closed with CLK.


If MCLK = CLK at 50 MHz, approximately 1.5 CLK
periods are required to transfer data through the 82490,
including 82490 propagation delay (15 ns) and setup
time to both the 82490 (5 ns) and CPU (7 ns for i860
XP CPU “CMOS” levels). The MBC must assure data
setup time at the CPU D0-D31 (D63) pins to the rising
edge of CLK for the cycle of BRDY# assertion
during reads, based on the propagation delay from
MDATA to CDATA listed in the 82490 AC timing
specs. Writes are not flow-through, as 82490 always
buffers the write-data and later 82495 gives CDTS# for
the write.

Most of the MBC-to-82490 signals are sampled by
82490 with MCLK, except for BRDY# and CRDY#:

MBC-to-82490 Signals

MCLK        CLK

MBRDY#      BRDY#
MFRZ#       CRDY#
MZBT#
MDATA       CDATA
MSEL#
MEOC#

MDOE# (asynchronous to both clocks)

The MBC must be partitioned into an MCLK side and
a CLK side. Fortunately, the CPU side of the MBC passes
only a few signals to the MCLK side, and vice versa.
The signals listed below from the dual-i860 XP CPU
MBC design in Appendix C must go through a
synchronizer. Refer to the Appendix for signal definitions.
In the following diagram, a right-arrow (-->) identifies
synchronizing to CLK, while a left-arrow (<--)
means synchronizers on MCLK:


Clock Domain of the Signal:

MCLK or SNPCLK     Neither      CLK

MRESET                          RESET
YBGT#                           BGT#
YMEOC#                          CRDY#
YCEOC#                          BRDY# (maybe)
MBRDY#                          BRDY# (maybe)
MSWEND#            MSWENDA      SWEND#
MADS#                           CADS# .or. SNPADS# .or. CDTS#


The signals MKWEND# and MNA# might also need synchronizing to CLK, if they are derived from M-bus
responses.

Two TI 74AS4374 “Dual-Rank Synchronizer” chips
(Figure 7) are used to transfer critical signals between
clock domains, while avoiding metastability. This
20-pin DIP has one clock input and 8 pairs of flip-flops.
Thus each of the 8 “Q” outputs reflects the value of its
“D” input after 2 clock periods. One chip is clocked by
CLK and the other by MCLK. If fewer than 8 signals
need synchronizing, chips such as the Signetics
74F50728 or Intel’s 85C220 EPLD can combine
synchronization with other functions [Ham90].

For an asynchronous or strobed memory bus, M-bus 
signals (such as MBRDY#) get delayed by the syn- 
chronizer for 2 CLK periods before the 82495 can see 
them. For a clocked (but not by CLK) M-bus, 82495 
outputs (such as CADS#) get delayed by 2 MCLKs by 
the other synchronizer before the M-bus sees them. 
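
The two-clock delay of a dual-rank synchronizer can be seen in a small behavioral model. This is a sketch of a generic pair of D flip-flops in series, under the assumption that each call to `clock()` represents one rising edge of the destination clock; it does not model metastability, and the class name is hypothetical rather than a description of the 74AS4374 internals.

```python
class DualRankSynchronizer:
    """Two D flip-flops in series, both clocked by the destination clock."""

    def __init__(self, initial=1):
        # Active-low signals such as MBRDY# idle high, hence the default of 1.
        self.rank1 = initial
        self.rank2 = initial

    def clock(self, d):
        """Apply one rising clock edge with input level d; return Q after the edge."""
        self.rank2 = self.rank1   # second rank takes the first rank's old value
        self.rank1 = d            # first rank samples the asynchronous input
        return self.rank2


sync = DualRankSynchronizer()
mbrdy_n = [0, 0, 1, 1, 1]                  # MBRDY# active for two CLK edges
print([sync.clock(level) for level in mbrdy_n])
```

The input sampled active on the first edge does not appear at Q until the second edge, which is the 2-CLK delay the MBC must budget for when creating BRDY# from MBRDY#.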

The following 82495 signals are defined as “asynchro- 
nous”, meaning that no external synchronizer is re- 
quired: 

• FLUSH#, SYNC# 

• MALE, MBALE 

• MAOE#, MBAOE#

Many signals can cross clock boundaries without
synchronizing, because they will be ignored until
corresponding status signals such as SWEND# and
CADS# have been synchronized by the MBC. Thus
they will be stable when sampled:

• MWB/WT#, DRCTM#, MTHIT#, MHITM#
(sampled when SWEND#)

• RDYSRC, KLOCK#, CPLOCK#, CW/R#,
CD/C#, CM/IO#, MCACHE#, BE7:0# (sampled
when CADS#)

Other signals do not cross clock boundaries, but remain
within the MBC CLK logic:

• CNA#, PALLC#, CACHE#, LEN, PCD, PWT,
CTYP, PCYC, MFRZ# ...


Synchronizer Delays 

To avoid lost time due to synchronizer delays, the fol- 
lowing options exist: 

1. Pipeline the 82495/MBC interface. This hides the
delay in synchronizing CADS# to its MCLK
counterpart MADS#.

2. Define the M-bus protocol so that MBRDY#
precedes MDATA by 1 MCLK for reads. Thus the 2
CLK delay in creating BRDY# from MBRDY# is
hidden. Likewise define MSWEND# to precede
MHITM# and MTHIT# by a CLK, by generating
MSWEND# from SNPCYC#.

3. Keep the snooping signals (SWEND#, MHITM#, 
MTHIT#, SNPINV, SNPCYC#) which flow be- 
tween 82495s on the same CLK, so that no synchro- 
nizers enter the snoop path. This is feasible only for a 
small number of physically proximate CPUs. 

4. Synchronize the snooping feedback signals from the 
M-bus (MSWEND#, etc...) only at the destination. 
They will be asynchronous to MCLK, transitioning 
with the individual CLK of their source. 

5. Avoid MCLK, using a strobed-only M-bus. Strobed
buses appear in single-CPU systems with an
unclocked DRAM interface.

6. Activate MEOC# to 82490 as soon as possible after
the last MBRDY#. MEOC# allows 82490 to begin
the next data transfer without waiting for CRDY#
synchronization.


BRDY# Generation 

Below are recommended sequences of the 82490 and 
CPU burst-transfer “Readys” for CPU reads, assuming 
the bus widths are equal. Sequences with more clocks of 
delay are acceptable but suboptimal. 

1) Synchronous M-bus (MCLK = CLK): MBRDY#
precedes BRDY# by 1 or 2 CLKs, to allow
propagation time for data through the 82490 and setup
time at the CPU pins.

2) “Divided Synchronous” M-bus (e.g., CLK = 50
MHz, MCLK = 25 MHz, skew controlled):
MBRDY# precedes BRDY# by 1 or 2 CLKs. The
BRDY# state machine must ignore MBRDY# in
the CLK period after it was sampled active.

3) Other Clocked M-bus (MCLK < CLK):
MBRDY# must go through a dual-rank synchronizer
latch (such as the TI 74AS4374) clocked by
CLK to produce BRDY#. That means 2 CLK
delays between MBRDY# and BRDY#. MBRDY#
MUST remain active for at least 1 CLK period to
assure that the synchronizer latched it active. To
avoid one MBRDY# getting wrongly sampled
active twice, the BRDY# state machine should ignore
any second MBRDY# in the CLK period after it
was sampled active.

4) Strobed M-bus: here MISTB# must go through the
synchronizer with 2 CLK delays to create BRDY#.
An edge-sensitive strobed M-bus avoids the problem
of wrongly converting one M-bus transfer to 2
BRDY#s, as a level-change marks each M-bus
transfer.
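
The "ignore MBRDY# in the CLK period after it was sampled active" rule in cases 2 and 3 can be sketched as a one-bit state machine. The model below assumes the input list holds the synchronized MBRDY# level (0 = active) seen at each CLK edge, and that one MBRDY# spans at most two CLK samples (CLK/2 <= MCLK < CLK); the function name is hypothetical, and the additional synchronizer delay is not modeled.

```python
def brdy_from_mbrdy(samples):
    """Turn per-CLK samples of MBRDY# into per-CLK levels of BRDY#.

    A sample of 0 produces one active BRDY# (0); the sample in the very
    next CLK period is ignored so that a single slow MBRDY# is not
    counted twice.
    """
    out = []
    ignore = False
    for level in samples:
        if level == 0 and not ignore:
            out.append(0)      # BRDY# active for this CLK
            ignore = True      # skip the next sample
        else:
            out.append(1)      # BRDY# inactive
            ignore = False
    return out


# Two M-bus transfers, each seen active on two consecutive CLK edges
print(brdy_from_mbrdy([1, 0, 0, 1, 0, 0, 1]))
```

If MCLK were slower than CLK/2, one MBRDY# could span three or more CLK samples and this simple rule would no longer be sufficient; an edge-sensitive strobe, as in case 4, avoids the problem.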


When M-bus width is greater than CPUbus width, the
above rule holds only for the first BRDY#. Successive
BRDY# activations follow the rules below:

• M-bus = 2*CPUbus: 2 BRDY#s occur for each of
the first 2 MBRDY#s. The second BRDY# should
occur 1 CLK after the first. The third BRDY#
cannot begin until after the second MBRDY#.

• M-bus = 4*CPUbus: 4 BRDY#s occur for the first
MBRDY#. The last 3 BRDY#s can occur
immediately in the 3 CLKs after the first BRDY#.
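
The number of BRDY#s owed to the CPU for each MBRDY# follows directly from the width ratio and the 4-transfer CPU line. The following is a minimal counting sketch, assuming a full CPU line fill; `brdys_per_mbrdy` and its parameters are hypothetical names, and the function only counts activations without modeling their CLK timing.

```python
def brdys_per_mbrdy(width_ratio, mbus_transfers, cpu_transfers=4):
    """Return, for each MBRDY# in order, how many BRDY#s it satisfies.

    width_ratio    : M-bus width / CPUbus width (1, 2, or 4)
    mbus_transfers : MBRDY#s in the M-bus burst (4 or 8)
    cpu_transfers  : BRDY#s the CPU expects (4 for a line fill)
    """
    counts = []
    remaining = cpu_transfers
    for _ in range(mbus_transfers):
        n = min(width_ratio, remaining)   # stop once the CPU line is filled
        counts.append(n)
        remaining -= n
    return counts


print(brdys_per_mbrdy(2, 4))   # M-bus = 2*CPUbus
print(brdys_per_mbrdy(4, 4))   # M-bus = 4*CPUbus
```

The trailing zeros are the MBRDY#s that only fill the rest of the 82490 line, which is why the number of MBRDY#s is never smaller than the number of BRDY#s.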

For asynchronous systems (MCLK < CLK), high
performance design choices are:

M-bus width = 2 * CPUbus width OR
M-bus width = 4 * CPUbus width

The wider M-bus allows each M-bus transfer to satisfy
2 or 4 CPU transfers, so that the CPU is not starved for
data during a line fill. The 82490 switches its CDATA
outputs to the next value the CLK after BRDY#
assertion by the MBC for the current value, so the MBC
controls the provision of data to the CPU on linefills.

A low-cost MBC can use M-bus width = CPUbus width with
a slower MCLK, by converting the first MBRDY# to
BRDY# through a synchronizer. The last 3 BRDY#s
can be asserted by MBC after completion of all the
M-bus transfers. That will allow the CPU to proceed
executing after receiving the first datum, which is the one
it was waiting for in most cases. Alternatively, the
M-bus protocol can be defined so that no idle clocks occur
on M-bus after the first MBRDY# and the MBC
knows by counting CLKs when to assert successive
BRDY#s.

Shown in the following timing diagrams are data trans- 
fers on both buses for CPU reads. Although they as- 
sume no dead clocks (wait states) during the M-bus 
burst, dead clocks are allowable. 

Writes are not shown in the diagrams because the MBC 
never supplies the CPU BRDY#s for burst writes. 
RDYSRC = 0 for most writes, and the 82495 controls 
the CPUbus transfers. The exception to this rule is I/O
writes, which 82495 does not post; for I/O writes, the
MBC supplies BRDY# to the CPU, but I/O accesses
are always 1 non-bursting transfer.

Figure 8. Data Transfers, M-bus Width = CPUbus Width. MCLK = CLK 



Figure 9. Data Transfers, M-bus Width = CPUbus Width. CLK/2 < MCLK < CLK.
Note the starvation on the CPUbus (extra wait state)

Figure 11. Data Transfers, M-bus Width = 4*CPUbus


Pipelining

Pipelining the MBC-to-82495 interface reduces latency 
by allowing the MBC to arbitrate for the next M-bus 
transaction while the first is proceeding. If the M-bus is 
also pipelined, it allows the snoop for the next to begin 
during the data transfer for the first. 

Signals used in pipelining the 82495 are CNA#,
BGT#, MALE, KWEND#, SWEND#, and
CDTS#. The 82495 will not listen to CNA# until the
clock of BGT# activation. Also, KWEND# activation 
sometimes allows the 82495 to create a next cycle, such 
as an allocation after a write miss. MALE deassertion 
allows the memory address to remain at the value for a 
previous request, even though the next request CADS# 
and other control signals have already occurred in re- 
sponse to CNA#. The MBC must latch the 82495 out- 
put signals which change in response to CNA#, until 
their status no longer matters to ongoing cycles. 

Note that 82495 and 82490 automatically pipeline the 
CPUbus interface to i860 XP CPU by activating NA# 
and latching address and data. 

Pipelining the M-bus itself involves sending a next
address for snooping and DRAM access while data
transfer from the current address still remains incomplete.
This increases bandwidth by overlapping slow DRAM
access with bus data and address transfers, as in the
i860 XP CPU pipelined bus.

While each 82495 allows only a one-stage deep pipeline, 
the M-bus can have a deeper pipe as requests from sev- 
eral different 82495s can be in progress. The number of 
stages in the M-bus pipe should match memory access 
latency. For example, use 3 stages for a 240 ns mem- 
ory with a 120 ns bus MADS#-to-MNA# (and 
SWEND#) time, so that a second and third request get 
issued during the memory latency of the first. Pipelin- 
ing does not imply that multiple snoops are ongoing 
waiting for SWEND#; that is a split-transaction bus, 
defined in a later section. Thus a quick SWEND# 
turnaround time speeds a new request onto the M-bus. 

The advantage of a pipelined bus using a 4-transfer 
burst is illustrated in Figures 12 and 13. Assumed is a 
fast memory access time of 4 MCLKs. With a slower 
access time, pipelining becomes more important for 
maintaining data bus bandwidth; even with the 
4-MCLK access, the unpipelined data bus is idle 50% 
of the time. 
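
The 50% figure follows from simple cycle counting, sketched below. The model assumes back-to-back requests, one data transfer per MCLK, and (for the pipelined case) a single stage of overlap in which the next access proceeds entirely under the current burst; these are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def data_bus_idle_fraction(access_mclks, burst_transfers, pipelined):
    """Steady-state fraction of MCLKs in which MDATA carries no data."""
    if pipelined:
        # The next access overlaps the current burst, so the longer of
        # the two sets the request-to-request period.
        period = max(access_mclks, burst_transfers)
    else:
        # Each request waits for its access, then transfers its burst.
        period = access_mclks + burst_transfers
    return 1 - burst_transfers / period


print(data_bus_idle_fraction(4, 4, pipelined=False))   # 4-MCLK access, 4-transfer burst
print(data_bus_idle_fraction(4, 4, pipelined=True))
```

With a slower memory, for instance an 8-MCLK access, the unpipelined idle fraction under the same assumptions rises to two thirds, which illustrates why pipelining matters more as access time grows.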


M-bus Arbitration 

If the M-bus possesses more than one master, each
MBC must arbitrate to gain control of the M-bus
whenever its 82495 activates CADS#. No arbitration logic is
included in the 82495 or 82490, except for the ability to
float (high-impedance) the 82495 and 82490 M-bus
outputs via the MAOE# and MDOE# signals. The
BGT# and MAOE# inputs to 82495 are from MBC
arbitration logic. The simplest systems can use a
HOLD/HLDA/BREQ protocol like the i860 XP CPU
and Intel486 DX CPUs themselves, which is
centralized arbitration.

Figure 12. Data Transfers for Non-Pipelined M-bus. Note low MDATA Bandwidth.

Figure 13. Data Transfers for Pipelined M-bus

Expandable buses like Futurebus+ and Multibus-II use
distributed arbitration to allow a variable number of
masters. Bus parking (retaining ownership of the M-bus
until another master requests it) is advised to avoid
unnecessary delay.

The “restricted backoff protocol” of 82495 requires
that it be granted the bus for a modified-line writeback
after it activates MHITM#, before it will snoop or
initiate any other transactions. The snooping MBC must
relinquish the M-bus immediately after the CRDY# of
the M-line writeback so that the original owner can
complete its work.


Sequencing 

A typical sequence of request and response signals
between the 82495 and MBC is shown in Figure 14. The
“SL” entities (CPUSL, 82495SL, 82490SL, MBCSL) are
for another CPU/Cache core, the SLave(s) who snoop
when the master CPU owns the bus. No DMA (such as
EISA or MCA) interaction is shown, but it will be
similar to the CPU responses, except that no writeback will
be done by DMA. Time increases downward. A minus-sign
prefix means deassertion.


The arbitration for the M-bus shown in the diagram 
assumes a HOLD/HLDA protocol like the CPUs use. 
That is a primitive centralized scheme, suitable only for 
a small number of processors. 

The sequencing may vary from that shown; for exam- 
ple, MSEL# may precede CDTS#. MADS#, 
MW/R#, MA31:3, MM/IO#, MD/C#, and 
MBE7-0# would all be valid simultaneously. The sig- 
nals in parentheses would be asserted only in the case of 
a M-line hit in the snooper, and some signals for that 
writeback and possible cache-to-cache transfer are not 
shown. 


[Figure 14 sequence diagram. Column headings: CPU, 82490, 82495, MBC, M-BUS, MBCSL, 82495SL, 82490SL, CPUSL]

* = Signal might occur sooner or not at all, depending on the type of request and bus protocol. 

** = These lines of the sequence occur only on a Hit-to-Modified (MHITM#)

Figure 14. MBC Signals and Protocol Layers

Flowchart of MBC Algorithm (not applicable to all cases)


    CADS#
      |
    M-bus already owned by this MBC?
    Y | N
      |
    Arbitrate for bus.
      |
--> Enable 82495 to drive address to bus (MAOE#, MALE).
    Echo other request parameters (MW/R#, MCACHE#, etc.) to the bus.
      |
--> Assert BGT#.
      |
    Determine cacheability, assert pins KWEND#, MKEN#, MRO#.
    Latch control signals (MW/R#, etc.).
    Assert CNA# to invoke next 82495 request.
      |
    MHITM# from other masters?
    Y | N
      |
    Abort memory cycle. Do cache-to-cache transfer.
      |
--> Wait for CDTS# (before beginning data transfer).
      |
    Forward snoop responses to master 82495
    using SWEND#, MWB/WT#, DRCTM#.
    Signal burst transfers of M-bus via MBRDY#.
    If RDYSRC = 1, echo burst transfer acknowledgments on BRDY#.
    Compensate for LR ≠ 1 by stopping BRDY# assertion when CPU line filled.
      |
    Notify 82495 and 82490 of completion of transfer via MEOC# and CRDY#.
      |
    New CADS#?
    Y | N
      |
    Relinquish bus ownership.
    Deassert MAOE# to re-enable snooping by this 82495.
Cacheability of each request must be determined by the
MBC to prevent the 82495 and CPU from caching
things like memory-mapped I/O device registers. The
i860 XP CPU samples its KEN# (Cache ENable)
pin at the time of the first BRDY# for a transfer or at
NA#, whichever comes first. The 82495 offers more
flexibility than the CPU cacheability indicators, by
using the KWEND# (cacheability Window END)
input to indicate validity of the MKEN# and MRO#
pins. The values of MKEN# and MRO# are based on
address decode, either locally in the MBC or from a
centralized decoder on the memory bus. For best
performance, KWEND# should come as soon as possible,
as it allows 82495 to decide what the next CADS#
should be: for example, to begin an allocation for a
write miss, or to start another writethrough.

A typical implementation would activate KWEND# 2
clocks after CADS#, using a PLD or fast SRAM to
decode the upper bits of the address to generate
MKEN# and MRO#.
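
The address decode that produces MKEN# and MRO# is essentially a table lookup, which is why a PLD or fast SRAM suffices. The sketch below uses an entirely hypothetical memory map (the region boundaries are invented for illustration and do not come from any Intel design), and returns the pin levels with 0 meaning asserted.

```python
# Hypothetical memory map, for illustration only:
# (start, end, MKEN# level, MRO# level)
REGIONS = [
    (0x00000000, 0x7FFFFFFF, 0, 1),   # assumed DRAM: cacheable, writable
    (0xFFF00000, 0xFFFFFFFF, 0, 0),   # assumed ROM: cacheable, read-only
]


def decode_cacheability(address):
    """Return (MKEN#, MRO#) levels for a 32-bit address; 0 = asserted."""
    for start, end, mken_n, mro_n in REGIONS:
        if start <= address <= end:
            return mken_n, mro_n
    # Everything else (e.g. memory-mapped I/O) is treated as non-cacheable.
    return 1, 1


print(decode_cacheability(0x00100000))
print(decode_cacheability(0xC0000000))
```

In hardware the lookup would be indexed by only the upper address bits so that the result is ready in time for KWEND#.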

Note that KWEND#, SWEND#, and BGT# need 
not be asserted by the MBC for SNPADS# cycles 
(snoop writebacks), but it may be simpler to assert 
them always. 


Snooping 

Snoop handshaking (bus watching) is useful in a multi- 
processor system, and may be needed in a uniprocessor 
system where the 82495 and CPU caches must be kept 
consistent with DMA accesses. The 82495 must snoop 
all DMA accesses to memory. The MBC sees requests 
from DMA (or other processors) on M-bus and con- 
verts them to SNPSTB# activations to the 82495. The 
following scenarios are possible: 

• DMA (or other processor) read causes 82495 
MHITM#: 82495/82490 must writeback the modi- 
fied line to memory before the first DMA data 
transfer occurs (unless the DMA controller is capa- 
ble of re-trying the read. If the DMA can retry, then 
the 82495 writeback must cause the initial DMA 
access to be aborted.) The MBC can assert 
SNPNCA (SNooP Non-CAcheable Access) to the 
82495 for a DMA read, so that the 82495 knows it 
can keep the block Exclusive upon a hit. 

• DMA (or other) read causes 82495 MTHIT# but 
not MHITM#: MBC must assert the “shared” 
status line of the M-bus, if the bus includes such a 
line. 

• DMA (or other) write causes 82495 MHITM#: 
82495/82490 must writeback the modified line to 
memory before the first DMA data transfer occurs. 
SNPINV should be activated to 82495 to invalidate 
the line. 

• DMA (or other) write causes 82495 MTHIT# but
not MHITM#: SNPINV should be activated to
82495 to invalidate the line.

Note that 82495/82490 cannot “write snarf”: they do
not absorb write-data from the memory bus and merge it
with current cached contents of the line. However, they
can absorb a full-line writeback from the M-bus when
doing a linefill of the same address (see the section on
Cache-to-Cache Transfers).
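
The four snoop scenarios above can be summarized as a mapping from the snooped access type and the 82495 hit outputs to the required actions. This is a sketch only: the function name and the action strings are hypothetical, the hit arguments are booleans meaning "asserted", and the timing of SNPINV relative to SNPSTB# is not modeled.

```python
def snoop_actions(is_write, mthit, mhitm):
    """List the actions the text requires for one snooped access.

    is_write : True for a DMA (or other master) write, False for a read
    mthit    : True if the 82495 asserted MTHIT#
    mhitm    : True if the 82495 asserted MHITM#
    """
    actions = []
    if mhitm:
        actions.append("write back modified line before first DMA data transfer")
    elif mthit and not is_write:
        actions.append("assert M-bus shared status line, if the bus has one")
    if is_write and (mthit or mhitm):
        actions.append("activate SNPINV to invalidate the line")
    return actions


print(snoop_actions(is_write=True, mthit=True, mhitm=True))
```

A snoop miss returns an empty list, and a read that hits a modified line returns only the writeback, optionally combined with SNPNCA by the MBC as described in the first scenario.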

Bus size adaptation can be done by the MBC, although 
it is not necessary in most systems. In an Intel486 DX 
CPU or i860 XP CPU system without an 82495/82490, 
an 8-bit device like a ROM can be used to contain code, 
and the CPU will automatically fetch at byte-width 
when the BS8# (Intel486 DX CPU) or CS8 (i860 XP 
CPU) pin is asserted. However, if a byte-wide ROM is 
used with an 82495/82490, adaptation of this byte in- 
terface is required from the MBC. 

If the ROM code is to be cacheable, the MBC must 
convert the 82495 line fetches at the ROM location to 
the appropriate number of byte-wide ROM reads. 
Latching transceivers must be employed at the 82490 
MDATA inputs or at the ROM output, to assemble the 
single-byte ROM reads into 4 (or 8) bus-width-wide 
transfers to the 82490s. 

If the particular M-bus protocol requires transfer 
widths shorter than the 82490 data width used, the ad- 
dress range requiring such transfers can be made non- 
cacheable to force 82495 and 82490 to use the width 
given in the request from the CPU. 

Bus size adaptation would also be needed to support a
512kB cache on a 32-bit memory bus. In that case, the
MBC must control transceivers and MBRDY#s to
interface between the 64-bit 82490 MDATA path and the
32-bit M-bus.


Bus Signal Levels 

Redriving 82495/82490 signals to the M-bus (such as 
MDATA, addresses, and 82495 control outputs) can 
optionally be done by the MBC. If the M-bus signal 
levels are not TTL, like ECL or Futurebus+ BTL 
(Backplane Transceiver Level), then appropriate trans- 
ceivers must lie between the M-bus and 82495/82490. 
Also M-buses with heavy capacitive loads should be 
redriven by transceivers, although 82495 and 82490 can 
tolerate loads of up to 100 pF. 

An additional advantage of buffering the 82495/82490 
signals with transceivers in a multiprocessor is that a 
“local M-bus” will exist between the chips and the 
main system M-bus. That allows some local traffic from 
the CPU module to attached peripherals to avoid tra- 
versing the M-bus. Such peripherals might include an 
MPIC/CCU (Multiprocessor Interrupt Controller/ 
Concurrency Control Unit), a JTAG boundary-scan 
controller, or a time-of-day clock, as in the Sequent 
Symmetry multiprocessor.


8.0 MBC FUNCTIONS FOR
MULTIPROCESSORS

Multiprocessor cache designs have additional motiva- 
tions beyond the uniprocessor goal of reducing memory 
access latency. Reducing memory bus usage is especial- 
ly important because the sharing of the bus creates a 
bottleneck. Thus multi-82495 systems need to minimize 
the number of transactions and make each one as short 
as possible. Large caches (256 kB or 512 kB) are
recommended for multiprocessors, to keep the miss rate as
low as possible.

In addition to the uniprocessor functions, an MBC in a 
multiprocessor must handle consistency with caching 
agents other than its own 82495. The multiprocessor 
MBC may also for performance reasons implement 
snoop filtering, cache-to-cache transfers, read-for-own- 
ership, and split transactions. 

Snooping results from listeners (slaves) on the bus must
be fed back to the master 82495 by the time SWEND#
is activated, if the system uses writeback policy
(writethrough requires no feedback). These results
(DRCTM#, MWB/WT#) are translations of the
slaves’ MHITM# and MTHIT# outputs. As shown in
Figure 15, typically all MHITM# outputs would be
wired-or via open-collector transceivers. Because slaves
on the bus may be busy with CPU operations and
back-invalidations, the snoop delay can vary. Thus a latched
derivative of the SNPCYC# output of all 82495s
would be wired-or to derive SWEND#. Alternatively,
the MBC can count CLKs to generate SWEND#,
using the worst-case upper-bound of CLKs required for
all 82495s to snoop, but that makes all snoop windows
long.

Because 82495 will tolerate SWEND# arrival up until
CRDY#, the M-bus data transfer for reads can overlap
the snooping delay. The transfers (MBRDY#s) can
occur during snoop latency, and an MHITM# activation
would cause the MBC to restart the transfer using
82490’s MSEL# pin.



If an 82495 linefill or writethrough hits a dirty line in
another cache, the MBC cannot BACKOFF the 82495.


[Figure 15 diagram. M-bus signals shown: MADS#, MSWENDA, MINHIBIT# (MHITM#), MSHARED# (MTHIT#), MSNPINV]


Figure 15. Creating Snoop Results from MHITM#, MTHIT#, and SNPCYC#

Label that other cache “the dirty 82495” and the
initiating 82495 “the master 82495.” The master MBC
must force a retry of the access after the dirty 82495
dumps the line, but the master 82495 has no “Backoff
and Retry” input pin. Rather, on a linefill the master
82495 must see the data transfer as if it had come from
memory. On a write, the master 82495/82490 data
write must wait until the modified line from the dirty
82495 has been dumped to memory. To do so, the
master MBC can either:

1) Delay the corresponding MBRDY#s to the master 
82490 until the modified line is completely written 
into memory and read out of memory. That implies 
the master MBC will remake the initial request to 
the memory controller after the writeback. 

OR 

2) Create a cache-to-cache transfer, so that the write- 
back data movements go directly into the master 
82490 over the M-bus. A later section describes 
cache-to-cache transfers. Such transfers are quicker 
than waiting for the entire modified line to be writ- 
ten back to memory. 

Note that the 82490 can restart the data transfer for 
reads or writes, in the case of MHITM# activation 
after the first MBRDY# but before MEOC# and be- 
fore CRDY#. To restart the 82490, the MBC must 
deassert MSEL# for at least 1 MCLK. 

Snoop Window Time (the delay from MADS# to 
SWEND#) limits address-bus bandwidth. In the inter- 
val from the address on M-bus until the acknowledge- 
ment (SNPCYC#) by all listeners, no more requests 
(addresses) can be on the bus. This restriction is im- 
plied by: 

1) A typical M-bus has only one MSWEND# wire,
which cannot be identified with the proper request if
several requests are outstanding.

2) 82495 does not snoop between BGT# and
SWEND#.

3) 82495’s “restricted backoff protocol”. That protocol 
requires the M-line writeback to be the first transac- 
tion by any 82495 which generates MHITM#, and 
82495 cannot snoop anymore until it finishes the 
MHITM# writeback. 

Data for read-misses cannot be transferred on the 
CPUbus until SWEND#, because the MBC cannot 
abort a CPU transfer after giving the first BRDY#. 
Thus the snoop window length influences CPU per- 
formance. Depending on the number of processors, bus 
speed, and memory speed, two scenarios arise from 
snoop window length versus memory access latency: 

1) SWindow < Memory Latency: SWEND# precedes
the MBRDY#s. If MHITM# occurs, the original
memory access can be aborted and its MBRDY#s
must be ignored.

2) SWindow > Memory Latency: data transfer on
M-bus can proceed, with MBRDY#s causing 82490
linefill buffers to advance. After SWEND#, the
MBC can begin BRDY#s to the CPU and 82490 if
MHITM# is inactive. If MHITM# is active, the
MBC must restart the M-bus data transfers after (or
during) the writeback from the modified snooper,
and can begin BRDY#s immediately after the first
MBRDY#.

The typical snoop window in a multiprocessor using
the hardware of Figure 15 is about 7 CLKs total snoop
turnaround delay, shown in Figure 16:

  1 CLK for propagation delay of master’s MADS#
    (to slave 82495s’ SNPSTB# inputs)

+ 0.5 to 1 CLK for 82495 to internally latch
    SNPSTB# and synchronize it to CLK.

+ 1 CLK for 82495 tag lookup and SNPCYC#
    (or more, if 82495 is busy with SNPBSY#)

+ 1 CLK to latch SNPCYC# into the MBC
    Set/Reset flip-flop generating MSWENDA.

+ 1 CLK for MSWENDA open-collector buffer
    and settling time from all slaves.

+ 2 CLKs for MSWENDA to get through
    synchronizer (on the master MBC’s CLK) and
    inverter to generate SWEND# to the master
    82495.

The window total assumes that the slave 82495s’ one
CLK delay from SNPCYC# until MHITM# is
concurrent with the synchronizer delay for creating
SWEND# from MSWENDA at the master. Those 2
CLKs can overlap with the next MADS# if it is
asynchronously generated from MSWENDA. Shorter
snoop window times can be obtained using duplicate
external tags as explained later, but this is not trivial.
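
Adding up the stages listed for the snoop window confirms the "about 7 CLKs" total. The sketch below simply sums the minimum and maximum of each stage as given in the text; the stage descriptions are abbreviated, and the optional `snpbsy_clks` parameter is a hypothetical way to account for extra clocks when a slave 82495 is busy.

```python
# (stage, minimum CLKs, maximum CLKs), taken from the list in the text
SNOOP_STAGES = [
    ("MADS# propagation to slave SNPSTB# inputs", 1.0, 1.0),
    ("82495 latches and synchronizes SNPSTB#", 0.5, 1.0),
    ("82495 tag lookup and SNPCYC#", 1.0, 1.0),
    ("latch SNPCYC# into MBC flip-flop (MSWENDA)", 1.0, 1.0),
    ("MSWENDA open-collector buffer and settling", 1.0, 1.0),
    ("synchronizer and inverter to SWEND#", 2.0, 2.0),
]


def snoop_window(stages, snpbsy_clks=0.0):
    """Return (min, max) snoop window in CLKs, plus any SNPBSY# delay."""
    low = sum(s[1] for s in stages) + snpbsy_clks
    high = sum(s[2] for s in stages) + snpbsy_clks
    return low, high


print(snoop_window(SNOOP_STAGES))
```

At 50 MHz (20 ns per CLK) the upper figure corresponds to a window of roughly 140 ns before any SNPBSY# delay is added.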

Read for Ownership (RFO) protocols decrease bus traf- 
fic by avoiding the M-bus write which would occur 
upon a write-miss. That is, a write-miss would go to the 
bus, followed by a 82495 line allocation request for the 
missed area. With RFO, the MBC does not echo the 
82495 write request to the M-bus. Instead, it asserts 
MFRZ# to freeze the written data in the 82490 memo- 
ry buffer, and allows the subsequent 82495 allocate line 
request to go to the bus. When the line data returns on 
the M-bus, MBC asserts DRCTM# to cause the 82495 
to mark the line as Modified (the memory system and 
other caching agents do not know of the original write 
miss, so they have invalid copies of the line). 

Signals which the MBC must use to do RFO are: 

1) PALLC# (Potential ALLoCate): from the 82495 
must be active on the write miss. If not, RFO cannot 
be performed. 

2) MKEN# and CRDY#: must be activated by the
MBC for the write, to trigger the 82495’s subsequent
allocation request.

Figure 16. Snoop Waveforms

3) MFRZ#: must be activated by the MBC to the
82490 at the time of the MEOC# and CRDY# for
the write.

4) INVAL (memory bus Invalidate indication): must 
be asserted by the MBC during the allocate-read to 
force all other 82495s to invalidate their now-obso- 
lete copies of the line. Slave MBCs will assert 
SNPINV to 82495s. 

5) DRCTM# (DiReCt To Modified): must be asserted
by the MBC during the SWEND# of the allocate, to
make the 82495 put the line in M-state.

6) MWB/WT#: must be asserted during the
SWEND# of the allocate.

7) CPLOCK# (82495 Pseudo Lock in Intel486 DX
CPU systems): if active, the MBC must NOT do 
RFO, because 82495 will activate PALLC# only on 
the second of the 2 writes. If the MBC tried to RFO, 
it would merge only half of the data into the modi- 
fied line. 

See [82495/490DS] for RFO information. 
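
The RFO conditions listed above reduce to a small decision followed by a fixed sequence of MBC actions. The following is a behavioral sketch only: the function names are hypothetical, and each boolean means "signal active" regardless of the pin's electrical polarity.

```python
def rfo_allowed(pallc_active: bool, cplock_active: bool) -> bool:
    """RFO needs PALLC# active on the write miss and CPLOCK# inactive."""
    return pallc_active and not cplock_active


def rfo_steps(pallc_active: bool, cplock_active: bool):
    """Return the MBC actions for a write miss, in order, as text labels."""
    if not rfo_allowed(pallc_active, cplock_active):
        return ["echo the write to the M-bus (no RFO)"]
    return [
        "assert MKEN# and CRDY# for the write; do not echo it to the M-bus",
        "assert MFRZ# to the 82490 at MEOC#/CRDY# of the write",
        "run the allocate-read on the M-bus with INVAL asserted",
        "assert DRCTM# and MWB/WT# during SWEND# of the allocate",
    ]
```

For example, `rfo_steps(True, True)` falls back to a plain M-bus write, because CPLOCK# marks the first half of a two-transfer write.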

Cache-to-cache transfers (CTCT) optimize the speed of 
consistency actions in a multiprocessor. For a read linefill
by a master causing an MHITM# from a slave, the
writeback data movements go directly into the master 
82490 over the M-bus from the dirty 82490. For a 
write, Read-for-Ownership (RFO) is required for the 
CTCT. If RFO is not implemented, then the cache-to- 
cache option can be used only on linefill (read) misses. 
In fact, RFO makes every write-miss into a linefill. The 
82495/82490 do CTCT only on entire lines, not bytes 
or words. 


For CTCT on a linefill causing MHITM#, the MBC 
doing the writeback must initiate the writeback at the 
subline address of the initial read. Starting the write- 
back from the first word of the line is NOT acceptable. 

While CTCT is faster than re-reading the line after 
waiting for the dirty writeback, the latency will be long- 
er in most systems than for fetching lines from main 
memory. CTCT would actually waste time for such 
items as shared instruction pages. For non-written data, 
transferring from memory to a CPU is probably faster 
than transferring from another cache. So 82495 supports
only M-line CTCT (no writeback occurs unless 
MHITM#). 

Signals involved in CTCT are DRCTM#, MZBT#, 
MHITM#, MBAOE#, and MSEL#. See 
[82495/490DS] for CTCT information.

Snoop filtering can be implemented by the MBC using
the 82495 SMLN# (SaMe LiNe) output to reduce the
latency for snooping. That is, SWEND# can be asserted
immediately to the requesting 82495 if the 82495
asserts SMLN# to indicate the current request is to the
same line as the previous request. In that case, other
caches have already checked this line. SMLN# must
be ignored if the M-bus has been used by other agents
between the two 82495 requests. The M-bus protocol need
not include a “non-snooped transfer type” for the use 
of this feature, as the MBC can simply ignore the snoop 
responses from other MBC/82495 modules. 
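
The snoop-filter rule just described is a two-input condition. As a sketch (the function and argument names are illustrative, with `True` meaning the signal or condition is active):

```python
def can_end_snoop_window_early(smln_active: bool,
                               bus_used_by_others: bool) -> bool:
    """True when the MBC may assert SWEND# immediately.

    smln_active: the 82495 asserts SMLN# (same line as previous request).
    bus_used_by_others: another agent used the M-bus between the two
    requests, in which case SMLN# must be ignored.
    """
    return smln_active and not bus_used_by_others
```

The MBC would have to track `bus_used_by_others` itself, for example with a flag set by any foreign M-bus cycle and cleared on each of its own requests; that bookkeeping is an assumption about one possible implementation.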

Split transaction (ST) memory buses such as Futurebus+
prove valuable in high performance systems. An
ST (also called “connect/disconnect” or “packet 
switching”) bus divides a single read request into a sep- 
arate address-transfer phase and a data-transfer phase. 
Thus the bus is not monopolized during the long laten- 
cy involved in accessing data across bus hierarchies. 
Writes typically are not split, as the data and address 
are available simultaneously from the writer. In a hier- 
archical bus, requests must be forwarded across bridges 
for the purposes of snooping and memory access at re- 
mote nodes, and the snoop latency may be long. Thus 
the bus should be freed between initial request and 
snoop-response for use in other transactions. 

The 82495 does not support ST directly. That would
require snooping current cache contents, and queuing
possible writebacks, for the accesses from other bus
agents between the time of the BGT# (the address
phase) and SWEND# (end of the address phase or
later). Also, the 82495 cannot write back dirty data between
SWEND# and CRDY# (end of the data phase) of an
ongoing cycle; it cannot suspend a transfer for later
resumption after a snoop writeback.


CADS#     BGT#          SWEND#        CRDY#
  |         | NNNNNNNNNNN | DDDDDDDDDDD |

NN = No snooping by 82495 will occur in this area.

DD = Delayed response by 82495 to snoop requests
here. MTHIT# and MHITM# asserted immediately,
but writeback of dirty data delayed until after
CRDY# for ongoing cycle.


82495’s inability to snoop during the NN period comes 
from the need to keep 2 addresses into the tags active — 
one for the outstanding 82495 request, whose tag must 
be updated at SWEND# based on MWB/WT# and 
DRCTM#, and one for the snoop inquiry. Further- 
more, any MHITM# on the M-bus could not be easily 
linked to the request causing the snoop if 2 snoops are 
outstanding. 

To support split transactions by snooping between 
BGT# and SWEND#, a set of tags external to the 
82495 can be implemented in the MBC. Those tags 
would replicate the contents of the 82495 internal tags, 
listening to all memory bus requests and responding 
with snoop results. Only when an 82495 state change (to
I or S) is needed will the 82495 be informed of snooping 
action — only then will the external tags relay the snoop 
request to it. 




Duplicate tags provide quicker snoop turnaround because
no SNPCLK-to-CLK synchronization is required;
the duplicate tags are in the SNPCLK/MCLK
logic. While they are a high-performance option, they
are costly and complex.

Memory cycle abort is required in multiprocessors
when a snooping 82495 activates MHITM# to signal
that the memory’s copy of the data requested by another
82495 is obsolete. As explained above, the memory
read or write must be INHIBITED until the writeback
is done. Depending on implementation, the original access
may need to be retried or abandoned. If CTCT and
RFO are implemented, then abandonment is probably
adequate. Although the complexity of aborting could
be avoided by delaying all memory action until
SWEND#, that would decrease performance. An
M-bus signal such as “SIV” (System Intervene) or
“MBOFF#” (M-bus Back OFF) allows the MBC of
the snooper to tell memory to abort.

If the M-bus is pipelined, there may be constraints on 
when the MBC can assert the “abort” signal to avoid 
cancelling the access in progress for the transfer preced- 
ing the one causing MHITM#. 


Locking 

Locking of the M-bus using the 82495’s KLOCK# 
output is required to ensure atomic accesses for CPU 
locks. For example, memory variables called sema- 
phores in a multitasking airline-reservation system pre- 
vent two processes from trying to update the same list 
of flight reservations simultaneously. A task would read 
the value of the semaphore in an uninterrupted read- 
modify-write (RMW) sequence, asserting the CPU’s 
LOCK# signal during the RMW to block interrupts*
(and block locked accesses by other processors to the 
same semaphore in a multiprocessor). If interrupts or 
other accesses were allowed during the sequence, two 
processes (or processors) might both read the sema- 
phore as “available” (zero) and both assume ownership, 
setting it to “unavailable” (nonzero). Then both might 
find the same empty seat and write their individual pas- 
senger’s name in the same seat location. In the end, 101 
passengers would have tickets for a 100-seat plane 
flight. 
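
The overbooking scenario can be replayed deterministically in software. The sketch below is a simplified model, not CPU behavior: the dictionary stands in for memory, the function names are hypothetical, and the "locked" version simply performs the read and the write as one uninterruptible step.

```python
def unlocked_acquire_interleaved(memory):
    """Both processes read the semaphore before either writes it."""
    seen_a = memory["sem"]      # process A reads 0 ("available")
    seen_b = memory["sem"]      # process B also reads 0: no lock held
    owners = []
    if seen_a == 0:
        memory["sem"] = 1
        owners.append("A")
    if seen_b == 0:
        memory["sem"] = 1
        owners.append("B")
    return owners


def locked_test_and_set(memory):
    """Uninterrupted read-modify-write: return old value, set nonzero."""
    old = memory["sem"]
    memory["sem"] = 1
    return old


def locked_acquire_sequence(memory):
    """Each process performs one atomic RMW; only the first sees zero."""
    return [name for name in ("A", "B") if locked_test_and_set(memory) == 0]
```

Without the lock both A and B believe they own the reservation list; with the atomic read-modify-write only A does.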

The 82495 and i860 XP CPU implement locks in a 
sequentially consistent, or serializing, manner. That is, 
all data loads and stores within the locked sequence 
occur on the external bus in the same order as they 
appear in the program. Also, all accesses in the pro- 
gram before the LOCK instruction are completed be- 
fore the first locked read or write, and all the locked 
reads/writes complete before other accesses after the 
locked sequence. This sequentiality is required by the 
semaphore example above, to prevent the CPU from 
updating the reservation list before it has obtained own- 
ership using the semaphore. 


*The CPU automatically blocks interrupts during the LOCKed sequence. The bus arbiter is responsible for blocking other accesses.

The MBC must serialize by ensuring all back-invalidates
from 82495 to the CPU have completed before
activating BRDY# for any locked read or write. So the
MBC must postpone locked BRDY#s until CAHOLD
is inactive and SNPCYC# has been inactive at least 2
CLKs (refer to [82495/490DS] section 5.1.1).


Bus Lock vs. Address Lock 

The 82495 echoes the CPU’s LOCK# signal onto its 
KLOCK# output, and forces all CPU accesses to go to 
the M-bus, even if they are 82495 cache hits. That guarantees
that other processors know of the LOCK and
the accesses. The 82495 assumes a BUS LOCK, where 
all other processors are kept off the bus during 
KLOCK# activation. Most existing “standard” buses, 
such as Multibus-II, have lock protocols which do such 
an exclusive lock. 82495 snoop behavior during asser- 
tion of its own KLOCK# is undefined, since it expects 
no other requests will be permitted then. The 82495’s
KLOCK# can remain asserted for multiple cycles 
when used with the i860 XP CPU, because the proces- 
sor allows up to 32 instructions inside a LOCKed se- 
quence. 

The 32-instruction i860 XP CPU LOCKed intervals 
may exceed 32 CLKs, as each instruction could take 
several clocks and cause a TLB miss (the intervals 
would be even longer if the i860 XP CPU did data 
cache line fills and line writebacks during LOCK#, but 
the 82495 prevents that by making KEN# = 1). Unfortunately,
this limits bus concurrency. When several
82495s share a bus or interconnection network, per- 
formance would improve if a LOCK# from one proc- 
essor did not block all others from accessing memory 
and I/O. Multiprocessors based on the Intel486 DX 
CPU are not affected as severely by LOCK#, because 
its lock endures only a few clocks — two memory ac- 
cesses at most. 

To improve performance of locks in a multiprocessor, a 
scheme of ADDRESS LOCKING may be implement- 
ed. This non-blocking protocol allows other accesses to 
the bus and memory in spite of LOCK# activation, 
and requires only that no other CPU tries to access the 
same LOCKed address. If another CPU does try to 
access the same location, that second CPU must be 
stalled until the first LOCK is de-asserted. To ensure 
that the second CPU continues to snoop accesses while 
stalled, BGT# to it for its request must be delayed 
until the lock is obtained, as signalled by the bus arbi- 
ter. Semaphore integrity is preserved if all CPUs follow 
the software convention of locking their RMW (Read- 
Modify-Write) semaphore accesses. Also by convention,
the address corresponding to the first access with
LOCK# asserted is the only locked location permitted
to that processor, until LOCK# deasserts (refer to the 
i860 Microprocessor Family Programmer’s Reference 
Manual Intel order #240875, Section 5-14). 
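
One way to picture address locking is a small table kept by the bus arbiter. The class below is a hypothetical software model, not a description of any real arbiter: it compares exact addresses (the lock granularity is an assumption) and returns whether BGT# may be given or the requester must be stalled.

```python
class AddressLockManager:
    """Hypothetical arbiter-side table: at most one locked address per CPU."""

    def __init__(self):
        self.locked = {}  # cpu id -> address locked by that CPU

    def request(self, cpu, address, lock_active):
        """Return True if BGT# may be given now, False to stall the CPU."""
        for owner, locked_address in self.locked.items():
            if owner != cpu and locked_address == address:
                return False  # another CPU holds a lock on this address
        if lock_active and cpu not in self.locked:
            # By convention the first locked access defines the locked address.
            self.locked[cpu] = address
        return True

    def release(self, cpu):
        """Call when this CPU's LOCK# deasserts."""
        self.locked.pop(cpu, None)
```

Accesses by other CPUs to different addresses return `True` even while a lock is held, which is the non-blocking property the scheme is meant to provide.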


Would software want to be able to cache lockable loca- 
tions? Since they are used for interprocessor or inter- 
process communication, it might seem dangerous to 
keep them “hidden” in a cache. However, caching al- 
lows a CPU to read a semaphore repeatedly without 
generating bus traffic, waiting until the semaphore is 
free as indicated by a zero value. These reads can be 
done in non-locked fashion. If a copy of the semaphore 
is cached, no bus traffic is used for the reads, and the 
semaphore value still gets updated via the normal 
MESI consistency hardware when the semaphore’s 
owner writes it with a new value. 
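
The polling pattern described here (spin on a cached copy, then attempt a locked access only when the semaphore looks free) can be sketched as follows. The model is deliberately simplified: `memory` stands for the coherent semaphore value, `releases_at` is a hypothetical poll number at which the owner frees it, and only locked accesses are counted as bus cycles.

```python
def locked_swap(memory, new_value):
    """Locked read-modify-write; always goes to the M-bus."""
    old = memory["sem"]
    memory["sem"] = new_value
    return old


def spin_then_lock(memory, releases_at, max_polls):
    """Return (acquired, locked_bus_cycles)."""
    locked_bus_cycles = 0
    for poll in range(max_polls):
        if poll == releases_at:
            memory["sem"] = 0   # owner's write; MESI updates the cached copy
        if memory["sem"] != 0:
            continue            # non-locked read hits the cache: no bus cycle
        locked_bus_cycles += 1
        if locked_swap(memory, 1) == 0:
            return True, locked_bus_cycles
    return False, locked_bus_cycles
```

In this model the waiting CPU generates no locked bus cycles at all until the semaphore is released.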

KLOCK# de-assertion for back-to-back Intel486 DX 
CPU locked accesses is required of the MBC if it uses 
address-based locking, so that the lock-manager knows 
the correct address. The i860 XP CPU always deacti- 
vates LOCK# for at least one clock between separate 
locked regions, by virtue of its deactivation in the clock 
after the last locked ADS#. However, the Intel486 DX 
CPU deactivates LOCK# only in the clock after the 
last BRDY# of the last locked access. Thus LOCK#
and KLOCK# may not deactivate when two XCHG
instructions occur in succession. The MBC can insert a
deactivation of the M-bus MLOCK# signal by knowing
that all Intel486 DX CPU locked accesses are Read-Modify-Write
sequences. The MBC should deassert
MLOCK# after the write, regardless of KLOCK#’s value.



Deassertion of KLOCK# by the MBC hardware may 
be required in any Intel486 DX CPU system, to avoid 
bus timeout and starvation of other bus masters when a 
continuous stream of locked accesses occurs in one 
processor’s program. Without it, one processor could 
monopolize the bus and prevent re-arbitration. 


CPLOCK# 

CPLOCK# has a purpose similar to KLOCK# in 
Intel486 DX CPU systems, but is unused in i860 XP 
CPU systems. PLOCK# (Pseudo-LOCK) indicates an
atomic 8-byte 2-transfer write for floating-point data 
which should not be interrupted. The 4-byte bus of the 
Intel486 DX CPU requires 2 transfers for an 8-byte 
datum, and if only half the transfer gets done before 
another bus master reads memory, half-wrong data 
could be read.

Thus the MBC should not relinquish the bus nor require
snoops of its 82495 from the time of the BGT#
for the first write (when CPLOCK# was asserted by
82495) through the BGT# of the second write. This
increases the worst-case delay of writeback for an 82495
snoop hit to a modified line; to avoid the delay, the
MBC can tie the CPLOCK# [PLOCKEN] pin low to
disable PLOCK functionality.

9.0 MORE ALTERNATIVES 

In addition to the options discussed above, several oth- 
er choices affect Memory Bus Controller design. 

M-bus clocking should be chosen to allow future ver- 
sions of 82495 and 82490 at higher clock speeds. Up- 
grading the CPU module performance by replacing the 
processor and 82495/82490 will be possible. While 
some redesign of the CPU-side MBC state machines 
may be needed for faster clocks, the memory bus can
remain the same. Thus an asynchronous interface with 
either a strobed unclocked M-bus or a clocked M-bus at 
less than 50 MHz is advised. A fully synchronous 
M-bus/CPU MBC would be difficult to move to higher 
clock speed. 

One convenient way to design the MBC is with the 
M-bus MCLK = 0.5*CLK. Probably it will be possi- 
ble to keep the M-bus at half the CPU CLK rate, even 
with faster CPUs. The big advantage of this half-speed 
link is that no synchronizers are needed within the 
MBC if the MCLK and CLK edges are skew-con- 
trolled. The MBC can be totally on CLK, as in the 
design example of Appendix B. 


The choice between a Strobed or Clocked M-bus is often
determined by existing bus protocols in which
82495/82490 will be used. Most existing buses are
clocked; however, Futurebus+ requires all bus entities
to use strobed transfers, but allows an optional clocked
mode for high-speed packet transfers [Fbus90]. The
tradeoffs are shown in Table 2.

Line size and M-bus width also determine upgradabil- 
ity to possible future versions of 82490 on the same 
M-bus, with more than 32kB per chip. If a higher-den- 
sity 82490 becomes available, the fact that 82495 has 8k 
tags requires: 

128 data bytes per tag (128-byte line, or sectored
64-byte lines)

AND

8-byte or 16-byte memory bus width

to allow a 1 MByte or 2 MByte 82490 configuration. If 
a smaller bus is used, a larger 82490 is possible, but 
the bus-size multiplexing described earlier would be 
needed. 

Writeback (WB) cache policy is advised for high-per- 
formance (multi)processors to limit bus traffic. Howev- 
er, a writethru (WT) design is simpler for the MBC 
because there never is a need to backoff the 82495 due 
to MHITM#. In fact, the snoop window in a WT system
becomes unnecessary and SWEND# can be activated
simultaneously with KWEND#. In such a system,
the only states of cache lines are S or I. Snooping has 
no effect during reads and only causes invalidations (in 
the slaves) for writes in a WT design. Cache-to-cache 
transfers and RFO are irrelevant. 


Table 2. Clocked vs. Strobed MBUS Tradeoffs 


CLOCKED MBUS Advantages 

Design techniques for clocked systems are well 
known. 

Fast arbitration using MCLK state machines. 

Burst transfers proceed at one datum per MCLK 

CLOCKED MBUS Disadvantages 

Must round up delays to MCLK period quanta, e.g.,
33 ns delay means two 30 ns MCLKs needed.

Some 82495-to-82495 signals must be twice 
synchronized: once at sender, once at receiver. 

Backplane length limited. 

MCLK skew must be controlled. 

Requires assumptions on CLK vs. MCLK speed 
ratio: for example, CLK > MCLK > CLK/2. 


STROBED MBUS Disadvantages 

MBC design may require delay lines and nonconventional
design techniques.

Arbitration slow because signal must be 
synchronized at arbiter and at modules. 

Burst throughput slowed if each transfer requires 
acknowledgement from receiver. 

STROBED MBUS Advantages 

Delays determined by device speed and physics, 
not by MCLK quanta. 

Each signal goes through synchronizer once, only at
receiver, so less time is lost at synchronizers.

Fewer limits on backplane length or capacitance 
or number of boards. 

No clock skew worries. 

Any CLK frequency will work. 
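
The rounding penalty of a clocked M-bus listed in Table 2 is a ceiling division. A minimal sketch of the arithmetic, with hypothetical function names:

```python
import math


def mclks_needed(delay_ns: float, mclk_period_ns: float) -> int:
    """Whole MCLK periods a clocked M-bus must allow for a given delay."""
    return math.ceil(delay_ns / mclk_period_ns)


def rounding_loss_ns(delay_ns: float, mclk_period_ns: float) -> float:
    """Time lost to rounding up, compared with a strobed bus."""
    return mclks_needed(delay_ns, mclk_period_ns) * mclk_period_ns - delay_ns
```

For the table's own example, a 33 ns delay on a 30 ns MCLK needs two periods (60 ns), so 27 ns is lost to quantization.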

10.0 MBC DIFFERENCES FOR i860 XP CPU VERSUS Intel486 DX CPU

The same MBC design can be used for either i860 XP
CPU or Intel486 DX CPU if the MBC supersets the
requirements of the two. A “CPU TYPE” configuration
pin can be included in the MBC to modify its behavior.
First, make the features as common as possible:

• Choose a configuration acceptable for both CPUs: 

a) 256 kBytes, 4 transfers/line, 64-bit M-bus, 32-byte 
line. 

b) 512 kBytes, 4 transfers/line, 128-bit M-bus, 
64-byte line. 

c) 256 kBytes, 8 transfers/line, 64-bit M-bus, 64-byte 
line. 

d) 512 kBytes, 8 transfers/line, 128-bit M-bus, 
128-byte line. 

• i860 XP CPU pfld data is cached in 82490; no optimizations
are included for pfld.

• Assume that LOCK# duration does not matter (i.e.,
that back-to-back LOCK#ed requests from 
Intel486 DX CPUs and long LOCK# cycles in i860 
XP CPU do not cause bus ownership timeout). 
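
The four common configurations a) through d) above can be cross-checked arithmetically: the line length should equal the number of transfers per line times the M-bus width in bytes. The sketch below only restates the listed numbers; the table and function names are illustrative.

```python
# (cache kBytes, transfers per line, M-bus width in bits, line bytes)
CONFIGS = {
    "a": (256, 4, 64, 32),
    "b": (512, 4, 128, 64),
    "c": (256, 8, 64, 64),
    "d": (512, 8, 128, 128),
}


def line_bytes(transfers_per_line: int, mbus_bits: int) -> int:
    """Line length implied by the burst length and the bus width."""
    return transfers_per_line * mbus_bits // 8


def lines_in_cache(cache_kbytes: int, line_size: int) -> int:
    """Number of cache lines the 82490 array holds."""
    return cache_kbytes * 1024 // line_size
```

Configurations a) and b) work out to 8192 lines, c) and d) to 4096 lines, so none exceeds the 8k tags of the 82495.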

Features Strictly for the Intel486 DX CPU:

• BE7-4# for M-bus must be synthesized by the
MBC from A2 and BE3-0#.

• CPLOCK# protection. 

• WRMRST (warm reset) can be included for both 
CPUs, but is optional. 
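
The byte-enable synthesis in the first item above might look like the following. This is a sketch under a stated assumption: A2 = 0 selects the lower four byte lanes of a 64-bit M-bus and A2 = 1 the upper four, with the unselected half driven inactive (high). The application note does not spell out this mapping, and the function name is hypothetical.

```python
def synthesize_mbus_be(a2: int, be3_0_n: int) -> int:
    """Return the 8-bit active-low M-bus byte enables BE7-0#.

    a2: CPU address bit A2 (0 or 1).
    be3_0_n: 4-bit active-low BE3-0# value from the Intel486 DX CPU.
    """
    be3_0_n &= 0xF
    if a2:
        # Upper doubleword selected: CPU enables move to BE7-4#,
        # BE3-0# are forced inactive (all ones).
        return (be3_0_n << 4) | 0x0F
    # Lower doubleword selected: BE7-4# are forced inactive (all ones).
    return 0xF0 | be3_0_n
```

For instance, a full doubleword access with A2 = 1 (BE3-0# = 0000) yields BE7-0# = 0x0F under this assumed mapping.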

Features Strictly for the i860 XP CPU: 

• Burst writes from the CPU (Length = 2 and 
Length = 4). 

• A second 74F377 BE# latch is needed, for i860 XP
CPU pins BE7#-BE4#, LEN, and CACHE#. 
PCYC and CTYP can also be latched for debug pur- 
poses. 

• PCHK# output from i860 XP CPU must be ignored
except during the CLK after BRDY# comes
from the MBC. PCHK# from Intel486 DX CPU is 
always valid. 

Differences between the MBCs: 

• Configuration pin strapping of 82495 inputs. 


• Decoding CPU request burst length from 
CLEN1:0 (82495 pins in Intel486 DX CPU systems)
or LEN and CACHE# (i860 XP CPU). 

• CPU Line length — 16 bytes vs. 32 bytes (i860 XP 
CPU) means that the Intel486 DX CPU MBC will 
give 2 BRDY#s for every 1 BRDY# of the i860 
XP CPU MBC. 


Differences between Intel486 DX CPU and i860 XP 
CPUs which have no impact on MBC: 


• Intel486 DX CPU FLUSH# input pin. 

• i860 XP CPU writeback caching, HITM#, and 
BOFF#. 

• i860 XP CPU CS8 vs. Intel486 DX CPU BS8#, 
BS16# (none are really useable). 

• Intel486 DX CPU RDY# pin and interruptible
bursts (not useable with 82495). 

• i860 XP CPU acknowledges HOLD during 
LOCK#. 



• EADS# duty cycle (50% maximum for i860 XP 
CPU and 100% for Intel486 DX CPU, but handled 
by 82495). 

• KEN# pin sampling interval by the CPU.

• Behavior of CPU in response to BOFF# assertion.

• i860 XP CPU BERR (Bus ERRor) pin versus 
Intel486 DX CPU NMI (Non Maskable Interrupt). 


11.0 SUMMARY 

The interface between a CPU/82495/82490 chip set 
and a system memory bus allows much flexibility and a 
wide range of performance options. The simplest MBC 
can be a few PALs, while a top-performance multipro- 
cessing version may take thousands of gates on an 
ASIC. Signal pin counts for the MBC can range from 
70 to 120, varying with the memory bus definition im- 
plemented by the MBC. 

While beyond the scope of this document, topics for 
consideration include detailed timing diagrams, critical 
path analysis, simulation of bus traffic, and hit rates. 
Useful also are simulations of performance impact of 
the number of CPUs, WB versus WT policy, memory 
latency, CTCT, RFO, and duplicate tags. Also at issue 
are interrupt controller hardware, PAX concurrency 
control, boundary scan and selftest, PC-compatibility
implications, i860 XP CPU pfld options, and high-speed
design issues of impedance, termination, and
noise.




12.0 BIBLIOGRAPHY 

[Agarwal88] “An Evaluation of Directory Schemes for
Cache Coherence,” by A. Agarwal, R. Simoni, J. Hennessy,
and M. Horowitz (CH2524-2/88/000/0280),
IEEE, 1988.

[Fbus90] “Futurebus+: Its features, and how to use
them,” John Black, in VMEbus Systems Magazine,
Feb. 1990, p. 23:40.

[Ham90] Tucker Hammerstrom, “Metastability,” Intel
Techbit #PLD-0390, March 1990.

[82495/490DS] 82495XP Cache Controller/82490XP
Cache RAM Data Sheet, Intel order #240956.

[i401] Intel486 DX CPU Microcomputer Model 401
Board Technical Reference Manual, order #504366.

[Intel486] Intel486 DX CPU Microprocessor Data
Sheet, Intel order #240440.

[i860XPDS] i860 XP CPU Microprocessor Data Sheet,
Intel order #240874.

[Thakkar90] “Performance of an OLTP Application
on Symmetry Multiprocessor System,” by S. Thakkar
and M. Sweiger (CH2887-8/90/0000/0228), 1990
IEEE Intl Conference on Computer Architecture.




APPENDIX A: 

Questions and Answers on MBC Design 


Q: Why activate BGT# early, since 82495 won’t
snoop between BGT# and SWEND#?

ANS: CNA# for MBC pipelining is ignored until
BGT#. Also, BGT# must precede CRDY# by
at least 3 CLKs. And BGT# must precede
BRDY#.

Q: How does PAX multiprocessing work with
82495 and an MBC?

ANS: A CCU chip must be included on the M-bus side
of 82495 and 82490 for each i860 XP CPU in a 
PAX multiprocessor. Refer to [MPIC90]. 

Q: Can the i860 XR CPU use an 82495/82490
cache?

ANS: No, the bus protocol of 82495 and 82490 
matches Intel486 DX CPU and i860 XP CPUs, 
but not i860 XR CPU. 

Q: Can 2 CPUs plug into one 82495, getting efficiency
from shared cache?

ANS: No, the protocol and physical capacitance of the 
interface do not allow it. 

Q: Should the same MBC be used for Uni & Multi?
(i.e., how much extra logic is added to make a
multiprocessor MBC?)

ANS: It is possible, and the extra logic is reasonable 
for a Uni which could be upgraded to multi by 
adding another CPU + cache module. 

Q: Are software models of 82495/82490 available
for simulation of MBCs? What simulators are
supported?

ANS: As of September 1990, beta versions of models 
will be available Q4 1990 from Silicon West, Inc. 
Phone = (213)597-5995, FAX = (213)494- 
4588. Contact Silicon West for information on 
simulators supported (currently Workview, Ver- 
ilog, Zycad VHDL, Mentor Graphics). 

Q: What is the fastest possible transfer of data from
Mdata to Cdata? (i.e., how many CPU CLKs are
spent?)

ANS: The initial timings are listed in [82495/490DS]. 
They are about 1.5 CLK periods including set- 
up-time at the CPU data pins. The connection 
from CDATA to MDATA is essentially a flow- 
through path. 


Q: Can the CPU-bus and Memory-Bus be on the
same 50 MHz clock?

ANS: Yes, but multiprocessor memory buses probably 
have too much capacitance and trace length to 
tolerate a 50 MHz clock. 


Q: What are pin-counts for an MBC (i.e., will it fit
in my ASIC)?

ANS: 70 to 120 signal pins, depending on the bus pro- 
tocol and MBC features. 

Q: How long is a reasonable cacheability window,
in MCLKs?



ANS: KWEND# is activated when MKEN# and 
MRO# are stable. MKEN# and MRO# can 
come from address decoders in the MBC or on 
the MBUS. Thus KWEND# could be 2 CLKs 
after CADS# if the MBC itself determines 
cacheability, or as much as 5 MCLKs if the M- 
bus must see the request and determine 
MKEN#. 

Q: How long is a reasonable snooping window, in
CLKs?


ANS: MWB/WT# and DRCTM# are generated 
from the snoopers’ MTHIT# and MHITM# 
signals. Thus SWEND# is activated when those 
signals (MWB/WT#, DRCTM#) are stable. 
That would be at least 7 CLKs, not counting the 
possible delay between CADS# and its M-bus 
counterpart MADS# (see the discussion of
snoop window above). 

Q: Is the SWEND# window length deterministic,
or must SNPBSY# determine it?

ANS: It is deterministic, but may be long when the
82495 is busy. Yes, the SNPCYC# signal is required
to determine SWEND#. If SNPCYC#
is not used, then the worst-case 82495 delay
must be embedded into the MBC logic, making
the window longer than necessary most of the
time.

Q: How long can 82495 be “busy”, activating
SNPBSY# and ignoring subsequent SNPSTB#
activations?

ANS: 82495 busy-ness is not due to CPU requests, because
82495 gives higher priority to the snoops.
But for snoops to M-state 82495 lines, 82495
must do inquiries to the i860 XP CPU and get
the more-recently modified data from i860 XP
CPU before 82495 can write back. An 82495 connected
to an Intel486 DX CPU does not need to
get modified data, as the Intel486 DX CPU has
only S-state lines in the CPU cache. However, if
SNPINV was active, 82495 must back-invalidate
either CPU for S, E, or M state lines. The
82495 must do multiple inquiries or invalidates
when the line ratio is 2 or 4.

Q: What is the synchronization penalty in snooping
(i.e., how long from M-bus request to MHITM#
validity)?

ANS: About 3 CLKs. See the discussion of “snoop 
window” above. 

Q: What is optimal 82495 cache-line length
(32, 64, 128)?

ANS: This is TBD from simulations or measurements. 
It depends on the behavior of SW applications 
the HW is intended for. 

Q: Can Futurebus+ be used as the M-bus for an
82495/82490 system?

ANS: Yes. The Futurebus+ spec is compatible with
the 82495/82490. It supports MESI, strobed
data transfer, address pipelining, cache-to-cache
transfers, Read For Ownership, and many other
features. 82490 would be used in strobed mode
for Futurebus+.

Q: Can 82495 do a split-transaction bus (if not, why
not)?

ANS: Maybe. 82495 implements a restricted-backoff 
protocol to eliminate potential deadlock condi- 
tions in a shared bus multiprocessor environ- 
ment. Because of that protocol, and the fact that 
82495 will not snoop between BGT# and 
SWEND#, it is difficult to implement split 
transactions. It may be possible, using an addi- 
tional set of tags which replicate 82495’s and 
allow snoops to continue between BGT# and 
SWEND#. 

Q: Can another 82495 be used for the “duplicate
tags” for split transaction snooping?

ANS: No, the 82495 signal definitions and protocols 
make that very difficult. 

Q: Why do the KWEND# and SWEND# signals
exist?

ANS: SWEND#, by gating 82490-to-CPU data transfer,
allows the M-bus data transfer simultaneous
with snooping. In the usual case, no modified
copy will be found by the snoopers, so that
transfer was not wasted. The alternative (that
data cannot be transferred from memory until
snoops complete) costs performance or requires
a central tag directory. SWEND# triggers the
82495 to update its tags.

KWEND# allows a variety of cacheability de- 
termination schemes — a long delay to determine 
MKEN# and MRO# might be needed if a pro- 
grammable RAM or EEPROM decodes cachea- 
bility based on address. If not, KWEND# can 
be activated quickly if there is a local MBC de- 
code of A31:A28 to determine MKEN#, for ex- 
ample. 
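
A local decode of A31:A28 for cacheability could be modeled as below. The memory map is entirely hypothetical (the region sets are invented for illustration), and each returned boolean means "signal active" for MKEN# and MRO# respectively.

```python
# Hypothetical map, indexed by A31:A28 (sixteen 256-MByte regions).
CACHEABLE_REGIONS = {0x0, 0x1}   # e.g. DRAM in the low 512 MBytes
READ_ONLY_REGIONS = {0xF}        # e.g. boot ROM at the top of memory


def decode_cacheability(address: int):
    """Return (mken_active, mro_active) for a 32-bit address."""
    region = (address >> 28) & 0xF
    mro_active = region in READ_ONLY_REGIONS
    # Read-only regions are treated as cacheable in this sketch.
    mken_active = region in CACHEABLE_REGIONS or mro_active
    return mken_active, mro_active
```

Because the decode depends only on four address bits, its result is available almost immediately after CADS#, which is what allows an early KWEND#.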

Q: Why not just one WEND signal? 

ANS: Performance. KWEND# can be determined 
quicker than line-status in most implementa- 
tions. The early knowledge of cacheability to the 
82495 allows it to begin line replacements and 
allocations, and activate the next CADS# to 
MBC. 

Q: How to connect 8-bit (or 16-bit) devices such as
ROM and serial ports to 82490?

ANS: If the devices are made non-cacheable, they can 
be tied to the MDATA pins of the least-signifi- 
cant 82490s. However, if fetches from them 
must be cacheable, then byte assembly logic 
(latching transceivers) must exist to allow 82490 
to transfer from them 4 or 8 bytes at a time 
(1 M-bus width per transfer). 82495 and 82490 
require all cacheable locations to do burst trans- 
fers an M-bus-width of data per transfer. 
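
The byte assembly mentioned in this answer amounts to collecting narrow reads into M-bus-width words before presenting them to the 82490. The sketch below models only the data grouping; the function name is hypothetical and the little-endian byte-lane order is an assumption.

```python
def assemble_transfers(device_bytes: bytes, mbus_width_bytes: int = 8):
    """Group bytes read from an 8-bit device into M-bus-width words.

    Returns one integer per M-bus transfer, with the first byte read
    placed on the least-significant byte lane (assumed ordering).
    """
    if len(device_bytes) % mbus_width_bytes:
        raise ValueError("line must be a whole number of M-bus transfers")
    return [
        int.from_bytes(device_bytes[i:i + mbus_width_bytes], "little")
        for i in range(0, len(device_bytes), mbus_width_bytes)
    ]
```

A 32-byte line read from an 8-bit ROM therefore takes 32 device reads but only four transfers on an 8-byte M-bus.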

Q: Does the 82495 have a CS8 mode? Does 82495
support i860 XP CPU in CS8 mode?

ANS: To support i860 XP CPU CS8 mode with 82495, 
the 8-bit ROM must be marked non-cacheable. 
This means that code being fetched in CS8 mode 
won’t be cacheable in the 82495 or the i860 XP 
CPU. For an 8-byte M-bus, the ROM data pins 
must be wired to the M-bus (MDATA of 82490) 
bits 7:0. For a 16-byte M-bus, the ROM must 
attach to M-bus bits 7:0 AND bits 71:64, which 
would require an 8-bit transceiver at the ROM.

Q: Should the DRAM controller be part of the
MBC?

ANS: For a simple uniprocessor, perhaps. Multipro- 
cessors would have a DRAM controller for 
(each bank of) main memory, separate from the 
MBCs. 

Q: How can the system implement retry upon an
M-bus parity error?

ANS: The MBC must re-issue the initial request, and
reset the 82490 transfer logic using the MSEL#
signal.

Q: Can 82490 use an ECC-corrected bus?

ANS: ECC (Error Correcting Code) can be used on
the main memory bus, but the ECC check bits
must be converted to parity or discarded before
feeding the 82490. ECC would have to be generated
at the 82490 MDATA pins for writes to
memory.

Q: Can the MBC implement cache-to-cache transfer
on a write?

ANS: No, the 82490 cannot “snarf” write data. That
is, it does not merge a write (partial line) from 
the M-bus with existing cached lines. It can do 
Read-For-Ownership, merging write-miss data 
with an incoming line writeback from another 
cache. 

Q: Can semaphores be cached in 82495/82490? 

ANS: Yes, but all reads/writes which are locked are
forced onto M-bus. So the semaphore would be 
read repeatedly without locking, until it is 
“free”. Then SW would re-read it in locked fash- 
ion to obtain ownership. 

Q: Is there any advantage to making semaphores
cacheable, if all locked accesses go to M-bus?

ANS: Yes, SW can repeatedly read the semaphore 
without LOCKing it, and no bus traffic thus is 
generated, waiting for the release of the sema- 
phore by any other master. 

Q: Can a single multiplexed address + data bus (like
Multibus-II) be used for M-bus?

ANS: Yes, but transceivers external to the 82495 and 
82490 are required. 

Q: How does the MBC implement a “BACKOFF”
when another 82495 activates MHITM#?

ANS: If the data requested from a master 82495 is
Modified in a snooper 82495, the master MBC
must postpone CRDY# until the modified line
is deposited in the master 82490, after the
snooper flushes the modified line to M-bus.

Q: Can MBC duplicate the CPU cache tags, to
avoid unnecessary inquire cycles?

ANS: Yes, but the performance benefit may not war- 
rant the extra hardware. 


Q: Can i860 XP CPU Late-Backoff mode be used
with 82495?

ANS: No.

Q: What are the advantages and disadvantages of
doing an asynchronous system (where MCLK is
not the same as CLK)?

ANS: Designers can easily upgrade the CPU side to
higher frequencies (above 50 MHz) by faster
PLDs in the CPU side of the MBC. The M-bus
interface and all modules on the M-bus will not
need to be changed. It is easier to design a board
when most parts run at a lower frequency.

Q: If the 82490 is reading information from the
memory bus and the MBC is generating
BRDY#s (RDYSRC = 1), can the MBC abort
the cycle by giving a premature CRDY#, and
restart it?

ANS: The MBC can abort a memory bus cycle but
cannot abort a CPUbus cycle. Once the first
BRDY# is generated the cycle must complete.
On the memory bus, a cycle is not aborted by
giving an early CRDY#. In fact the 82495 does
not understand that a cycle has been aborted.
Only the MBC and 82490 are involved. The
82490 allows its buffer to be reset using the
MSEL# signal.

Q: What is the purpose of 82490 having a separate
MOCLK for output data, in addition to the 
MCLK for input signals? 

ANS: MOCLK allows greater hold time for writes 
from 82495, if it is skewed slightly from the 
MCLK which M-bus receivers use. MOCLK 
and MCLK must be exactly the same frequency. 
If the skew is not needed, MOCLK can be tied 
low. 

Q: How many levels of pipelining can the 82495 use
on the external memory bus?

ANS: Each 82495 can use one level of pipeline on the 
memory bus, so the bus pipe depth can be great- 
er in a multiprocessor. A uniprocessor allows 
just one level of M-bus pipeline. 
APPENDIX B:

Intel486 DX CPU Uniprocessor MBC Design


Please refer to Application Note AP-458, Designing a
Memory Bus Controller for a 50 MHz Intel486 DX
Microprocessor Based System (Intel order #241166).
APPENDIX C:

i860 XP CPU DUAL-PROCESSOR MBC


OVERVIEW

This section presents a design for a memory bus controller
for a system containing two i860 XP processors,
each with an 82495XP/82490XP secondary cache. This
MBC, together with an i860 XP CPU, 82495XP, and
82490XP, comprises a core which interacts with a
memory bus utilizing a bus protocol similar to that of
the i860 XP CPU.

The design presented here features an i860 XP CPU
and 256 KB of 82495/82490 cache running at 50 MHz
in each core. The clocked 64 bit (+ 8 parity) memory
bus is asynchronous to the CPU and cache clock, allowing
memory to run at lower speeds for more economical
and convenient memory design. The MBC features
snooping and pipelining to the memory, as well as
advanced 82495 processes like write allocation, read for
ownership and cache-to-cache transfers.


ASSUMPTIONS

The implementation presented here is a two processor 
design which can be extended to more than two CPUs. 
The definitions and examples given in this appendix are 
specific to the two processor version. The section 
Extension to 3 or More Processors gives specifics for 
larger systems based on this design. 

The memory bus is 64 bits data plus 8 bits parity. 



The MBC design allows the processor to run at a high- 
er clock frequency than the memory bus. The frequen- 
cies are constrained such that the ratio of the frequency 
of the processor CLK and the frequency of the memory 
bus MCLK is between 1 and 2: 


CLK/2 ≤ MCLK ≤ CLK


Figure C-1. Pinout Environment of MBC

[Figure: MBC pinout, showing the CLK, i860 XP CPU, 82495XP, 82490XP, and M-bus signal groups. Pin counts: 1 CLK, 19 i860 XP CPU, 42 82495XP, 11 82490XP, 38 M-bus; 111 total.]

This constraint ensures proper synchronization of signals
which cross between the MCLK portion of the
MBC and the CLK portion. The prototype was de- 
signed and simulated with a CPU speed of 50 MHz and 
a memory bus speed of 33 MHz. 
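
As a quick check of the frequency constraint, the sketch below tests whether a CLK/MCLK pair keeps the ratio between 1 and 2. The function name is hypothetical, and treating both endpoints as allowed is an assumption; the text only says "between 1 and 2".

```python
def clocks_are_compatible(clk_mhz, mclk_mhz):
    """Return True when the CLK:MCLK ratio lies in [1, 2].

    Equivalent to CLK/2 <= MCLK <= CLK. Inclusive endpoints are an
    assumption, not something the design description states.
    """
    if clk_mhz <= 0 or mclk_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    return clk_mhz / 2 <= mclk_mhz <= clk_mhz
```

For the prototype values, `clocks_are_compatible(50, 33)` is true, since 50/33 is roughly 1.5.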

Snooping mode can be independently set to strobed or
clocked in each core.

The main memory is responsible for returning the
MKEN# attribute to the memory bus controller in the
MCLK following MADS# assertion.

To save synchronization clocks, the MBRDY# signal
of the protocol is defined to be asserted one MCLK
before data is actually available.

The 82495 operates with 32 bytes/line, 1 line/sector, 
and requires 4 memory bus transfers per line fill. 


OPTIONS 

With modifications the 82495 can operate in a mode 
with 64 bytes/line, 1 line/sector, requiring 8 memory 
bus transfers per line fill. 
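
The transfer counts quoted for the two line sizes follow directly from the 64-bit data bus width. A small sketch of the arithmetic (the function name is hypothetical):

```python
def transfers_per_line_fill(line_bytes, bus_data_bits=64):
    """Number of memory bus transfers needed to fill one cache line."""
    bytes_per_transfer = bus_data_bits // 8
    if line_bytes % bytes_per_transfer:
        raise ValueError("line size must be a multiple of the bus width")
    return line_bytes // bytes_per_transfer
```

A 32-byte line needs 32 / 8 = 4 transfers, and a 64-byte line needs 64 / 8 = 8.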

The design here utilizes the 82490’s clocked memory
bus mode. The strobed mode can also be utilized by
modifying the design.

Support for various 82495 PFLD modes can be added 
to the design. 

Operation with either write-through or write-once pro- 
tocol can be performed. 


MEMORY BUS PROTOCOL 
M-bus Signals

The system M-bus resembles the i860 XP CPU bus. It 
allows CPU modules with or without external cache on 
the same M-bus, so that balance between high perform- 
ance and low cost can be achieved. The signal specifica- 
tions below indicate Input (I), Output (O), or bidirec- 
tional (I/O) from the MBC’s point of view. Output 
signals to the memory bus such as MADS#, MLEN, 
and MA31:MA3 are floated by all MBCs except the 
one currently owning the M-bus. 

Signals whose names begin with Y (as in YBGT#) are 
in the MCLK side of the MBC, while an X prefixed 
name is in the CPU CLK side of the MBC. The X and 
Y signals are internal to the MBC. 


MRESET (I) - Memory bus RESET 

This signal forces the CPU to begin execution in a 
known state. It resets all MBC machines which are 
driven by MCLK. It is also synchronized (via a 2-stage 
synchronizer) to CLK and fed to the RESET inputs of 
the CPU, 82495, 82490s and all MBC machines which 
are driven by CLK. 
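
The 2-stage synchronizer mentioned above can be modeled behaviorally as two registers clocked by the destination clock. This is a sketch for intuition only: the class name is hypothetical, and metastability, which is the reason the second stage exists, is not modeled.

```python
class TwoStageSynchronizer:
    """Behavioral model of a two-flop synchronizer in the destination clock domain."""

    def __init__(self, initial=0):
        self.stage1 = initial
        self.stage2 = initial

    def clock(self, async_input):
        """Advance one destination-clock edge and return the synchronized output."""
        # Both flops update on the same edge: stage2 takes the old stage1 value.
        self.stage2 = self.stage1
        self.stage1 = async_input
        return self.stage2
```

In this model a change on the input becomes visible at the output on the second destination-clock edge after it occurs.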

MADS# (I/O) - Memory bus ADdress Strobe 

This signal indicates that a new valid bus cycle is cur- 
rently being driven. The cycle address (A31:A3) and 
cycle specifications are valid in the MCLK that 
MADS# is asserted. A pipelined MADS# will be is- 
sued only after the MBC knows that the current cycle 
is guaranteed not to be aborted. For most memory ac- 
cesses, the master will assert MSNPSTB# to snoop 
other caches on the bus. When MSNPSTB# has been 
asserted, MNA# will cause a new MADS# to be issued
after MSWENDI# signifies snooping has completed.
Furthermore, if MHITMI# was asserted with
MSWENDI# in this case, the new MADS# cannot be 
issued until after the current cycle (now a snoop write- 
back) has been completed. When MHITMI# is not 
asserted with MSWENDI#, MADS# can be asserted 
immediately following MSWENDI#. If MSNPSTB# 
was not asserted for the current cycle, then MADS# 
could be issued immediately after MNA#, without 
waiting for MSWENDI#. 
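
The pipelining rules in the paragraph above can be condensed into one predicate. This is a sketch, not the MBC's actual state machine: the function and argument names are hypothetical, each argument is a logical "has been asserted" flag rather than a pin level, and a new cycle is assumed to be pending.

```python
def may_issue_pipelined_mads(mna_seen, snooped, swend_seen,
                             hitm_at_swend, current_cycle_done):
    """Decide whether a pipelined MADS# may be issued for a pending cycle."""
    if not mna_seen:
        # The memory controller has not yet invited a new cycle with MNA#.
        return False
    if not snooped:
        # MSNPSTB# was not asserted: no need to wait for MSWENDI#.
        return True
    if not swend_seen:
        # The snoop window is still open; the current cycle could be aborted.
        return False
    if hitm_at_swend:
        # Snoop hit to a modified line: wait for the snoop write-back to finish.
        return current_cycle_done
    return True
```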

For read cycles MADS# is issued after CADS#, regardless
of CDTS# state. Requesting the memory bus,
via MBREQ, is also done immediately after CADS#.
This is because CDTS# in a read cycle does
not affect the memory bus, but indicates when the first
BRDY# can be issued to the CPU.

For memory writes MADS# is issued only after
CDTS#. Requesting the memory bus, via MBREQ, is
also done after CDTS#. This guarantees that for write
cycles the memory bus data is valid 1 MCLK after
MADS# (similar to the CPU).

MNA# (I) - Memory bus Next Address 
Acknowledgement 

This is the memory bus next address signal, driven by 
the memory controller. It indicates to the MBC that 
the memory bus is ready to accept a new bus cycle, 
although the previous one has not been completed yet. 
If the MBC has a new cycle pending and the current 
cycle is guaranteed not to be aborted (see MADS# 
above), then a new MADS# will be issued. Note that
the maximum level of pipelining on the memory bus is 1.
MBRDY# (I/O) - Memory bus Burst ReaDY#

This is the burst ready signal. For read cycles,
MBRDY# indicates that in the following MCLK the
memory bus will present valid data on the 82490
MDATA pins. For writes, MBRDY# indicates that in
the following MCLK the memory bus will accept the
data from the 82490 MDATA pins. Note that this signal
is active 1 MCLK before the data is available on the
memory data bus. This reduces the synchronization
penalty between the M-bus and CPUbus by 1 MCLK
period.

For a clocked-asynchronous MBC, MBRDY# is delayed
by the MBC 1 MCLK and passed to the 82490
MBRDY# pin. For a strobed-asynchronous MBC, the
82490 MISTB and MOSTB will change value in response
to MBRDY#.
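
For the clocked case, the one-MCLK delay can be pictured as a shift of the MBRDY# sequence by one position, which lines the delayed signal up with the clocks in which data is actually on the bus. A sketch with a hypothetical function name; `True` means asserted in that MCLK.

```python
def delay_one_mclk(mbrdy_per_mclk):
    """Return the MBRDY# sequence delayed by one MCLK.

    The result is one entry longer than the input so that an assertion
    in the last input clock is not lost.
    """
    return [False] + list(mbrdy_per_mclk)
```

If the memory asserts MBRDY# in four consecutive MCLKs starting at index 1, the 82490 sees its MBRDY# input asserted at indices 2 to 5, the same clocks in which data is valid.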

For Cache to Cache Transfers, the MBC with the Mod- 
ified line drives MBRDY# active once per MCLK 
without wait states for the duration of the line burst. 

MSNPSTB# (I/O) - Memory bus SNPSTB# 

This is the memory bus snoop strobe signal. It is assert- 
ed 1 MCLK after MADS# by the MBC which asserted 
MADS#, for all cycles that could be M-state in the 
other MBC. In writebacks and I/O cycles, 
MSNPSTB# is not asserted. The MSNPSTB# output 
of each MBC is connected to the 82495 SNPSTB# in- 
put of the other MBC, in this two processor design. 

MSWENDO# (O) - Memory bus SWEND# 

Output 

This is the memory bus snoop window end indication 
which is driven by the snooping MBC. It is connected 
to the master MBC’s MSWENDI# input, indicating that
snooping is finished and the snoop attributes are valid. 

MSWENDO# is an asynchronous signal which is 
triggered by the 82495 SNPCYC# falling edge, and 
is negated after sampling an active SNPSTB#. 
MSWENDO# of one MBC is connected directly to the 
MSWENDI# input of the other MBC. 

MSWENDI# (I) - Memory bus SWEND# Input 

MSWENDI# is connected directly to the other core’s
MSWENDO# output. It is internally sent to two synchronizers:
synchronized to CLK to generate 82495
SWEND#, and synchronized to MCLK for MBC state
machines which determine whether the current bus cycle
should be aborted.

MSWENDI# indicates the end of the snoop window
and that the snoop results MHITMO# and MTHIT#
are valid. An active MHITMI# indicates a snoop hit
to a modified line, and causes the master MBC to discard
any data which has arrived from main memory, so
that new data, which is being written out as the snooping
core performs a snoop write back, can be accepted.
MTHIT# of each core is connected to the
MWB/WT# input of the other core, to generate the
WB/WT# signal to the 82495.

MHITMO# (O) - Memory bus HITM# Output

This indicates a snoop hit to a modified line. In the two
processor implementation of this MBC, it is connected
directly to the other MBC’s MHITMI# input.

MHITMI# (I) - Memory bus HITM# Input

MHITMI# is connected to the MHITMO# output of
the other MBC, and determines if MBOFF# and
MABORT# will be asserted. It is sampled on
MSWENDI# activation.


MTHIT# (O) - Memory bus Snoop Hit Indication 

This snoop hit indication is based on the 82495
MTHIT# output. The MTHIT# output of the snooping
core is used by the master core to determine the
WB/WT# state for the accessed line. The 82495
MTHIT# signal is passed directly onto the memory
bus when the SNPINV signal is inactive for the snoop.
On snoops with SNPINV active, the memory bus
MTHIT# line is driven low, regardless of the value at
the 82495 MTHIT# pin.

The MTHIT# signals from the memory bus controllers
on the bus are wire-ANDed together. Because the
82495 MTHIT# output only changes state with each
new snoop, the master memory bus controller must
float its MTHIT#.
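
The MTHIT# gating and the wire-AND can be sketched with pin levels, where 0 is low (asserted) and 1 is high. The function names are hypothetical, and the pull-up that makes an undriven line read high is an assumption about the board, not something stated above.

```python
def mbus_mthit_level(mthit_pin, snpinv):
    """Level a snooping MBC drives onto the M-bus MTHIT# line.

    With SNPINV active the line is forced low regardless of the 82495 pin;
    otherwise the 82495 MTHIT# level is passed through.
    """
    return 0 if snpinv else mthit_pin


def wired_and(levels):
    """Resolve a wire-ANDed line: low if any driver pulls low.

    A floating driver is represented by None. An undriven line reads
    high, assuming a pull-up resistor.
    """
    driven = [level for level in levels if level is not None]
    return int(all(driven))
```

For example, `wired_and([None, mbus_mthit_level(1, True)])` resolves to 0: the master floats its output and the snooper forces the line low on an invalidating snoop.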

MBOFF# (O) - Memory bus BOFF# 

This is the memory bus back-off signal which is driven
by the master MBC. The master MBC floats its bus
concurrent with MBOFF# activation. When the
snooper MBC samples an active MBOFF# and it has a
pending snoop write-back cycle, it issues the cycle to
the memory bus. Note that the snooper issues the cycle
even though it is still in a bus hold state (MHLDA
asserted). If MHITMI# is sampled active during
MSWENDI# and the previous cycle has completed,
then MBOFF# will be asserted immediately after
MSWENDI#. If the previous cycle has not completed
and the pipelined cycle hits a modified line, then
MBOFF# will be asserted only after the previous cycle
completes. The snooping MBC floats its bus only after
the snoop write-back cycle has completed. Note that
from the arbiter’s viewpoint the bus is still granted to
the master MBC.


MABORT# (O) - Memory bus Abort

This is the memory bus abort signal which is driven
by the master MBC. When the main memory samples
an active MABORT# it aborts any cycle that is currently
being serviced. The memory aborts the cycle regardless
of the number of MBRDYs that have been
issued. Thus MBRDY# of the aborted cycle will not
be issued after MABORT#. A new cycle could be serviced
immediately after MABORT#.

If MHITMI# is sampled active during MSWENDI#
and the previous cycle has been completed, then MABORT#
is asserted immediately after MSWENDI#.
If the previous cycle has not been completed and the
pipelined cycle hits a modified line, then MABORT#
is asserted only after the previous cycle has completed.

MABORT# can also be asserted during read for ownership
with a hidden write (allocation after a non-completed
write in the main memory). In this case if the
master MBC samples an active MKEN# (1 MCLK
after MADS#) during a potentially allocatable write
cycle, it asserts MABORT# immediately, i.e. 2
MCLKs after MADS#.

Note that MABORT# is always guaranteed to be a 1
MCLK width pulse.
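
The MABORT# timing rules above can be written as two small helpers that return the MCLK number in which MABORT# is asserted. This is a sketch with hypothetical names. Reading "immediately after" and "after the previous cycle has completed" as "in the next MCLK" is an assumption, chosen because it matches the clock numbers quoted for Figures C-3 to C-5 later in this appendix.

```python
def mabort_clock_for_snoop_hit(swend_mclk, previous_cycle_end_mclk=None):
    """MCLK in which MABORT# is asserted after MHITMI# is sampled active.

    swend_mclk: MCLK in which MSWENDI# was sampled with MHITMI# active.
    previous_cycle_end_mclk: MCLK in which a still-running previous cycle
        finishes, or None if no earlier cycle is outstanding.
    """
    earliest = swend_mclk + 1  # assumption: one MCLK after MSWENDI#
    if previous_cycle_end_mclk is None:
        return earliest
    # A pipelined cycle may not be aborted until the previous one completes.
    return max(earliest, previous_cycle_end_mclk + 1)


def mabort_clock_for_hidden_write(mads_mclk):
    """MABORT# for a potentially allocatable write with MKEN# sampled active."""
    return mads_mclk + 2  # MKEN# is sampled 1 MCLK after MADS#, MABORT# follows
```

With these assumptions, a non-pipelined snoop hit sampled in clock 4 gives MABORT# in clock 5, and a pipelined hit sampled in clock 7 whose previous cycle ends in clock 8 gives MABORT# in clock 9.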

MLOCK# (I/O) - Memory bus LOCK 

This signal does not exist in the current implementa- 
tion. Instead, the MBC simply refuses to give up the 
M-bus to the arbiter when it is running locked accesses. 

MHOLD (I) - Memory bus Hold Request 

When this input to the MBC is asserted, the MBC as- 
serts MHLDA and floats all inputs and outputs except 
MBREQ, MHLDA, MSWENDO#, and MBOFF#. If 
the MBC has outstanding bus cycles in progress 
(MADS# has been asserted), they are completed be- 
fore the MBC relinquishes the bus. MHOLD is recog- 
nized during MRESET assertion. 


MHLDA (O) - Memory bus Hold Acknowledge 

The memory bus hold acknowledge signal goes active 
when an MBC relinquishes the bus in response to an 
MHOLD request. The memory bus controller floats its 
bus in the same MCLK that it issues the MHLDA. 
When the MBC leaves bus hold, MHLDA is negated 
and the core resumes driving the bus. If a cycle is pend- 
ing when leaving bus hold, the MADS# will be issued 
in the same MCLK that MHLDA is negated. 

MINT (I) - Memory bus Interrupt 

This interrupt signal is connected directly to the i860 
XP CPU in the core. 


MKEN# (I) - Memory bus KEN# 

This is the memory bus cache enable signal. It is used 
by the MBC to determine the length of the current bus 
cycle, and is also connected directly to the 82495 
MKEN# input. 

In potentially cacheable read cycles, it determines cycle 
length. In potentially allocatable write cycles, it deter- 
mines whether read for ownership with hidden write 
will be performed. 

In the current implementation, MKEN# must be driven
by the memory controller in the MCLK after
MADS# was issued.


MRO# (I) - Memory bus Read Only 

Assertion of this signal causes an access to be treated as 
read only by the core. This signal is connected directly 
to the 82495 MRO# input, as well as to the MBC. 

MWB/WT# (I) - Memory bus WB/WT# 

This is the write-back/write-through input connected 
to the memory bus. It is connected through MBC logic 
to the 82495 MWB/WT# input. 

MDRCTM (I) - Memory bus Direct-to-M

This is the memory bus DRCTM# signal which forces
a line entering the cache to be placed directly in the 
[M] (modified) state. In addition to this signal which is 
connected from the memory bus to the 82495, the MBC 
can internally drive the 82495’s DRCTM# pin during 
read-for-ownership cycles. 
MFLUSH#, MSYNC# (I) - Memory bus
FLUSH#, SYNC#

These signals cause the core to flush or sync its cache,
by asserting FLUSH# or SYNC# to the 82495, respectively.
The signals are driven by the main memory
controller upon detecting a Core flush or sync command,
which consists of a special cycle with either
MBE1# or MBE3# active, respectively.

MBREQ (O) - Memory bus Request

The MBREQ# signal is asserted by an MBC to indicate
to the memory bus arbiter that the MBC needs the
memory bus. An MBC will generate this signal regardless
of whether or not the MBC is currently driving the
bus.

MBREQ# is not issued for snoop write-back cycles. If
the snooping core already had its MBREQ# pin asserted,
the pending cycle which caused the MBREQ# is
aborted by the snoop write-back, according to 82495
protocol. The MBC state machines of the snooper,
however, continue to assert MBREQ# until an internal
time-out period has elapsed, allowing the snooping
82495 to reissue the aborted cycle after the snoop write-back
has completed. Therefore a core which is waiting
for the bus can service a snoop write-back without losing
its request for the bus.


MLEN (O) - Memory bus LEN 

This signal together with MCACHE#, MW/R# and 
MKEN# determine the memory bus cycle length ac- 
cording to the following table: 


MW/R#   MLEN   MCACHE#   MKEN#   Length   Notes
  X      0        1        X       1        1
  X      1        1        X       2        1
  0      0        0        1       1        2
  0      1        0        1       2        2
  0      X        0        0       4
  1      X        0        X       4




NOTES: 

1. Locked i860 XP CPU write-back cycles (length = 4), caused by the i860 XP CPU executing a FLUSH instruction during a
LOCKed sequence, are treated as normal write cycles (length = 1 or 2 according to LEN). This is allowed since i860 XP CPU
write-back cycles always access an 82495 modified line (in [M] state) and are only written into the 82490, without updating
memory.

2. MKEN# must be driven valid the clock following MADS# by the memory controller. 
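
The cycle length table can be expressed as a small decoder. This is a sketch with a hypothetical function name; arguments are pin levels (0 or 1), so `mcache=0` and `mken=0` mean the active-low signals are asserted. The comments describe how the table rows are read here, which is an interpretation of the table rather than a statement from it.

```python
def mbus_cycle_length(mw_r, mlen, mcache, mken):
    """Return the number of M-bus transfers for a cycle, per the table above."""
    if mcache == 1:
        # MCACHE# inactive: length follows MLEN (1 or 2 transfers).
        return 2 if mlen else 1
    if mw_r == 1:
        # Write with MCACHE# active: a full-line write, 4 transfers.
        return 4
    if mken == 0:
        # Read with MCACHE# and MKEN# both active: line fill, 4 transfers.
        return 4
    # Read with MCACHE# active but MKEN# inactive: falls back to MLEN.
    return 2 if mlen else 1
```

Each of the six table rows maps to one branch; for instance `mbus_cycle_length(0, 1, 0, 1)` returns 2 and `mbus_cycle_length(1, 0, 0, 1)` returns 4.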


MM/IO#, MD/C# (O) - Memory bus M/IO# and
D/C#

These signals, together with MW/R#, define the mem- 
ory bus cycle, according to the i860 XP CPU Data 
Sheet. They are driven in the same MCLK as 
MADS#. 

MBE[7:0]# (O) - Memory bus BE [7:0]# 

The byte enable signals to the memory bus identify 
which bytes are being accessed. They are identical to 
the CPU byte enables on CPU generated cycles. For 
82495 generated cycles (write-backs and allocations) all 
MBE#s are asserted. 


MCACHE# (I/O) - Memory bus CACHE# 

In a master core MCACHE# is an output; in a snooping
core it is an input. As an output, it indicates potentially
cacheable reads or an 82495 write-back.
MCACHE# is used by the system memory together 
with MLEN, MW/R# and MKEN# to determine cy- 
cle length. As an input, MCACHE# is connected to 
the 82495 SNPNCA pin. 


MW/R# (I/O) - Memory bus W/R# 

This signal is an output for a master core, an input for a 
snooping core. As an output, it indicates whether the 
memory access is a read or a write, and is used by the
system memory along with MM/IO# and MD/C# to
determine the cycle type, according to the i860 XP 
CPU Data Sheet. As an input, the signal is connected 
directly to the 82495 SNPINV pin. 

MA[31:3] (I/O) - Memory bus Address 

These are the memory bus address lines of the MBC. 
Along with the byte enable signals, they define the 
physical area of memory or I/O accesses. In a master 
MBC they are driven by the 82495 onto the memory 
bus together with MADS# (same MCLK). In a snoop- 
ing MBC, these lines are inputs to the 82495 which are 
latched by the MSNPSTB# signal.
MD[63:0], MDP[7:0] (I/O) - Memory bus Data
and Data Parity

64 bits of data, 8 bits of parity, connected through
transceivers to the i860 XP CPU and 82490s. When an
MBC does not own the bus, these pins are tristated.

XAS#/XSAS# - X Unit Address Strobe

XAS# is generated in the X-unit (synchronous to CLK), and is
synchronized and sent to the Y-unit as XSAS#.

XAS# indicates the start of a memory bus cycle from
the X-unit (CLK side). XAS# is generated as a result
of a CADS# from the 82495 on a read cycle or
CDTS# from the 82495 on a write cycle. XAS# is
held active until the X-unit receives YSBGT#.


YBGT#/YSBGT# - Memory bus Guaranteed
Transfer

YBGT# is generated in the Y-unit, and is synchronized
and sent as YSBGT# to the X-unit.

This signal is generated in the Y-unit after MADS#
(the cycle has been issued on the memory bus). When
YSBGT# arrives at the X-unit, the signal causes assertion
of the 82495’s BGT# input, and one clock later
(non-pipelined cycle) the assertion of KWEND#.
YSBGT# of a pipelined cycle (which is sampled during
the initial cycle, i.e. before its CRDY#) causes the
BGT# and KWEND# of the pipelined cycle to be
issued immediately after CRDY# of the initial cycle.

YBGT# of a pipelined cycle cannot be issued before
the MSWEND# of the previous cycle. This is guaranteed
by the M-bus protocol, which ensures that a pipelined
MADS# is not issued until after the
MSWEND# of the previous cycle.

BGT#, KWEND# (O) - Bus Guaranteed 
Transfer, Cache Window End to 82495 

BGT# and KWEND# are generated for every cycle 
(including snoop write-backs). In a non-pipelined cycle 
BGT# is issued immediately after sampling YSBGT#
active, and KWEND# is issued 1 clock later. In pipe- 
lined cycles, these signals are asserted after the 
CRDY# of the initial cycle. 


YMEOC#/YSMEOC# (O) - MBC Memory End
Of Cycle

YMEOC# is generated in the Y-unit, synchronous
to MCLK, and sent to the X-unit as YSMEOC#.
It indicates the M-bus transfer has finished, based on
the MBC’s transfer length count. YMEOC# directly
drives the 82490s’ MEOC# inputs. YSMEOC# causes
generation of the CRDY# signal to the 82495 and
82490s. For non-pipelined cycles CRDY# is issued immediately
after an active YSMEOC# (if CDTS# was
issued). For pipelined cycles CRDY# is issued after
the CRDY# of the previous cycle (if YMEOC# and
CDTS# of the pipelined cycle were issued).

YMEOC# is issued at least 2 MCLKs after YBGT# 
(for every cycle). 

YCEOC#/YSCEOC# - MBC CPU End Of Cycle 

This signal is internal to the MBC: YCEOC# is generated
synchronous to MCLK, and is synchronized to
CLK to produce YSCEOC#. It indicates that the
CPUbus transfer has finished, based on the MBC’s
transfer length count. It generates the BRDY#s to the
82495, 82490, CPU, and to other MBC machines. For
non-pipelined cycles all BRDY#s except the first are
issued immediately after an active YCEOC# (if
CDTS# was issued). For pipelined cycles all BRDY#s
except the first are issued after the CRDY# or the last
BRDY# (BRDY# * CLEN1) of the previous cycle.

YCEOC# can be issued before, with, or 1 clock after
YMEOC#. When the line ratio is 2 or 4, YCEOC#
precedes YMEOC# by a significant time, allowing
CPU linefills to complete long before the M-bus transfer
completes.

YCEOC# is asserted only if RDYSRC is active 
(High). 
BUS CYCLES


Non-aborted Read Cycles 

Figure C-2 is a timing diagram for the memory bus 
controller executing a line fill after the i860 XP CPU 
issues a read which misses the 82495/82490. The dia- 
gram reveals a number of the signals which are internal 
to the MBC, to provide a better perspective on the tim- 
ing of events. Note that signals which begin with an M 
are MBC signals to the memory bus. Signals that begin 
with Y originate in the Y side of the MBC which is 
synchronous to MCLK, and an X denotes origin in the 
X state machines, which are synchronous to CLK. 

The i860 XP CPU microprocessor issues a read cycle in 
CLK 0, as indicated by the assertion of ADS#. The 
82495 performs the tag lookup, and finds the request a 
cache miss. In CLK 2, the 82495 issues CADS# and 
the cycle control signals, alerting the memory bus con- 
troller that a 4 transfer 82495 read is requested. 

The X side state machines, which run on the processor 
CLK, issue an XAS# on the CLK after CADS# for an
82495 read cycle (CW/R# = 0). The XAS# signal
passes through the synchronizer running on MCLK to 
become synchronized in two MCLKs. The synchroniz- 
ed XAS# signal, called XSAS#, is sent to the Y side of 
the MBC in MCLK 4. 

In MCLK 5, XSAS# has initiated the assertion of
MBREQ to request the memory bus from the memory
bus arbiter. If the bus is already owned (or once it is
owned) by this MBC, XSAS# causes the assertion of
MADS# to the memory bus, MAOE# to the 82495,
and the internal YBGT# signal. The assertion of the
82495’s MAOE# signal allows the 82495 to drive its address
lines to the memory bus. YBGT# indicates that
the memory bus is owned by this MBC, and is sent to
the synchronizer for the X side of the MBC as well as
many Y side state machines.

On the Y side, YBGT# is used to deassert MBREQ#,
to sample YALLOC# on writes, and to initiate
MSNPSTB#. MSNPSTB# is asserted in MCLK 6 to
request a snoop in the other MBC. YBGT# is also
synchronized to CLK, appearing as YSBGT#, by
CLK 9. YSBGT# causes the assertion of BGT# to the
82495 in CLK 10, and, 1 CLK later, KWEND#. The
MKEN# input, which must be valid to the 82495
when KWEND# is asserted, must be driven by the
main memory on the MCLK after MADS# for this
implementation. These signal activities define the initiation
of normal bus cycles (as opposed to snoop write-backs).

In this particular example, the memory bus responds
quickly to the read request. Here, the memory subsystem
drives MNA# to the MBC in MCLK 6, and presents
data on the memory bus in MCLK 7. Since
MBRDY# must be driven by the memory bus 1
MCLK before data is available, MBRDY# is asserted
in MCLK 6, with successive MBRDY#s on the following
MCLKs. The YMBRDY# output of the MBC is
the MBRDY# signal delayed one clock, and drives the
MBRDY# input on the 82490s to read in the incoming
data.

While the data transfer is occurring, the second memo- 
ry bus controller responds to the snoop request for this 
memory access in MCLK 8. Because the data is not 
present in the cache of the other core, that MBC will 
assert its MSWENDO# output with MHITMO# 
driven high. These outputs of the snooping core are tied 
directly to the MSWENDI# and MHITMI# inputs, 
respectively, of the master core in this two core imple- 
mentation. Both of these signals are passed to the 82495 
(MSWENDI# is synchronized first) as well as to the 
state machines of both sides of the MBC. The arrival of 
these signals allows the core to accept the data as valid,
and to conclude the read operation when all of the
data has been transferred. 

The arrival of the fourth MBRDY# generates the 
YMEOC# and YCEOC# signals in MCLK 10. 
YMEOC# drives the MEOC# input on the 82490s. In 
addition, both signals are synchronized and sent to the 
X side of the MBC. Upon the arrival of YSCEOC#, 
the X state machines begin generating BRDY#s to the
i860 XP CPU. Upon arrival of YSMEOC#, CRDY# 
is driven to the 82495, indicating the end of the cycle. 
YMEOC# and YCEOC# are used to reset many of 
the Y side state machines, including cycle type and 
length indicators, and the drivers of 82490 signals such 
as YMALE# and YMSEL#. On the X side, the reset 
functions are triggered by CRDY# and the last 
BRDY#. 
Figure C-2. Non-Aborted Read Cycles




Aborted Non-Pipelined Cycles 

Figure C-3 illustrates an aborted non-pipelined cycle. 
MHITMI# is sampled active during MSWENDI# 
(clock 4) indicating a snoop hit to a modified line. Since 
the cycle is non-pipelined, MABORT# is issued immediately
and the core floats its bus (clock 5). Although
the bus is floated by the master core, the master still 
owns the bus (MHLDA remains inactive). 

MABORT# in clock 5 causes the main memory to
abort its cycle regardless of the number of MBRDYs that
have been issued. MBOFF# is also asserted in clock 5
to indicate to the snooping core that the master is float- 
ing its signals and the write-back may begin. The main 
memory floats its data bus in clock 6 in response to 
MABORT#. In the following clocks a snoop write- 
back cycle is performed by the snooper. The snooper 
will release the bus at the end of the write-back. 

Note that MSNPSTB# is not asserted during the 
write-back cycle since it obviously will not hit any 
cache. 


Aborted Pipelined Cycles 

Figure C-4 illustrates an aborted pipelined cycle. Although
MHITMI# is sampled active during
MSWENDI# (clock 7), MABORT# will not be issued
immediately since the previous cycle has not been completed
yet. MABORT# is issued in clock 9 after
the last data slice was read into the core. The core floats
its bus and asserts MBOFF# concurrently with
MABORT#. Upon sampling MBOFF#, the snooping
MBC begins the snoop write-back in clock 10.


Write Allocate 

Figure C-5 illustrates a write cycle which is potentially 
allocatable. This write is performed on the bus only in 
order to sample MKEN#, since the allocation cycle
will only be guaranteed if MKEN# is active.

MKEN# is sampled active in clock 2 causing the 
MABORT# to be issued immediately. The reason to 
abort the write cycle, even before MSWEND#, is due 
to the fact that a read for ownership cycle is guaranteed 
to be performed after the aborted write. 

In clock 4 the MADS# of the allocation cycle, which
becomes the MADS# of the read for ownership cycle,
is issued. This MADS# is issued only if MSWEND#
has not been issued yet, or if MSWEND# was issued
and MHITMI# was negated. If MHITMI# is asserted
during the MSWEND# that was issued, MADS# will
not be issued (since the snooper issues its MADS#).

A second MABORT# is issued in clock 8, signaling
the memory to abort the allocation and the snooper to
start flushing the modified line. Note that a second
MABORT# will be issued regardless of whether MADS# of
the allocation was issued or not. The first MABORT#
(clock 3) aborts the write cycle in the memory module
and does not affect the snooper. The second
MABORT# (clock 8) indicates to the snooper to start
its write-back cycle (and, if MADS# of an allocation
was issued, also aborts it in the memory module).

Figure C-4. Aborted Pipelined Cycles

MSNPSTB# is not issued for the allocation cycle since
write and allocation cycles access the same line.

If MKEN# had been negated in clock 2 then an allocation
would not have been performed and the write cycle
would have continued as a non-allocatable write cycle
(see Figure C-6).

Non-Allocatable Write 

Figure C-6 illustrates a write cycle without an alloca- 
tion. It can be either a non-potentially allocatable write 
cycle or a potentially allocatable write with inactive 
MKEN# (clock 1). 

The write cycle is aborted (MABORT# in clock 3) 
after sampling active MHITM# during MSWEND# 
(clock 2). In clock 11 the master core re-issues the 
MADS# of the aborted write cycle (after the snoop 
write-back has been completed). MSNPSTB# will not 
be issued again since the updated data had been written 
into the main memory and the snooper has gone to the 
invalid state. 


LIMITATIONS OF DESIGN 

The primary limitation of the implementation as it has 
been presented so far is that it includes only two proces- 
sors. The protocol set up in the design is not limited to 
two processors. The next section outlines the imple- 
mentation details which must be modified to extend the 
design to more than two processors. 

The design has no support for CSS mode, so the processors
cannot be booted from 8-bit EPROMs. Instead,
both processors boot in 64-bit mode, which may complicate
the use of the design in stand-alone systems.

The i860 XP CPU’s BERR, or Bus ERRor, input is not 
utilized in this design. The pin could be used simply as 
a non-maskable interrupt pin, but the memory bus con- 
troller as designed makes no provision to use BERR to 
correct a faulty bus access. Likewise, the parity check 
results from the i860 XP CPU’s PCHK# pin are of 
little value in this design outside of testing the i860 XP 
CPU’s parity functions. The MBC itself does not check 
the PCHK# output, and has no means of reissuing an 
access in case of parity error. 

The memory bus controller design here does not decode
and utilize the i860 XP CPU INTA cycles. The INT
pin itself is connected directly to the i860 XP CPU,
without affecting MBC operation.

Figure C-5. Potentially Allocatable Write

Figure C-6. Non-Allocatable Write

The Multiprocessor Interrupt Controller (MPIC) currently
being designed by Intel is not utilized in or supported
by this memory bus controller.

The memory bus controller’s treatment of LOCKed cycles
is simple and straightforward: when the 82495 issues
a memory access which is LOCKed (KLOCK# active),
the MBC will not relinquish the bus until a cycle
which is not LOCKed is issued. While this is adequate
for simple systems, it will not suffice for dual-ported
memories, where a given block of memory can be accessed
through more than one bus. In such systems, a
LOCK signal must be introduced to alert all possible
simultaneous users of memory that a LOCKed access is
in progress.
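
The LOCK rule above can be pictured as a one-bit tracker. This is a minimal Python sketch, not the MBC state machine itself; the class and method names are hypothetical, and bus arbitration beyond the LOCK rule is deliberately left out.

```python
class LockTracker:
    """Tracks whether the MBC must keep the bus because of LOCKed cycles."""

    def __init__(self) -> None:
        self.holding = False  # True while the last issued cycle was LOCKed

    def issue_cycle(self, klock_active: bool) -> None:
        # A LOCKed cycle (KLOCK# active) starts or continues the hold;
        # the first cycle that is not LOCKed ends it.
        self.holding = klock_active

    def may_relinquish(self) -> bool:
        # The bus may be given up only when no LOCKed sequence is open.
        return not self.holding
```

A dual-ported memory would need this same state made visible to every bus that can reach the memory, which is the role of the additional LOCK signal mentioned above.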


EXTENSION OF DESIGN TO THREE
OR MORE CPUs

Two Processor Implementation
Overview

Figure C-7 presents a simplified view of the multiprocessing
signals for the two processor implementation.
The basic address, data, and memory cycle control lines
are attached to a common bus. Only the core which
controls the bus will drive these signals, with all other
cores floating these lines and asserting MHLDA#.

When the bus master MBC issues a cycle, the
MCACHE# and MW/R# cycle attributes also serve
to drive the 82495s’ SNPINV and SNPNCA inputs of
both cores. SNPSTB# is issued by the master in the
clock following MADS#. In reality, both cores have a
SNPSTB# output at their Y-side state machines driving
a common line which connects to the SNPSTB#
input of both 82495s. The core which does not own the
bus floats its state machine driver on MHLDA, so the
signal acts only as an input in that core. The master
drives the SNPSTB# line, but the action of SNPSTB#
is blocked in its own 82495 because its MAOE# signal
is asserted.

The results of the snoop are driven out on the snooping
core’s MTHIT# and MHITMO# outputs, and
MSWENDO# is asserted. These signals are connected
directly to the MWB/WT#, MHITMI#, and
MSWENDI# inputs in the master core, respectively.

The MBOFF# signals of the two MBCs are also connected
together. During MHLDA (in a snooping
MBC) MBOFF# is an input, and in the master it is an
output. If the master asserts MBOFF#, control of the
data and control busses is given to the snooping MBC
so that a snoop write-back can be performed.


Three or More Processors

This section gives one method of extending the design
given here to three or more processors. The solution
presented here assumes that no changes are made to the
state machines as they are written for the two processor
system. Instead, some minor glue logic is added to three
of the signals to make the core an element in a scalable
multiprocessing system. However, modifying the state
machines is also a plausible solution.

In an implementation with three or more processors,
the primary address, data, and cycle control lines are
still connected to a common bus, as in the two processor
version. MCACHE# and MW/R# are also utilized
in the same way as in the two processor version: the
outputs of the cores drive a common line which in turn
also drives the 82495 SNPNCA and SNPINV inputs of
all cores.

The SNPSTB# signal connects directly from core to
core in a two processor version. In an implementation
with three or more processors, the SNPSTB# line is
simply extended to all the processors in the system.
Only the bus master will actually drive the line, and
snoopers will be floating the SNPSTB# output from
their state machines. Again, the snoop request is ignored
in the master because its MAOE# is asserted.
Similarly, the MBOFF# signal becomes a common line
which only the master will drive and which all other
cores will sample.



The six signals in the upper portion of Figure C-7,
which communicate MSWEND and the snoop results
MHITMO# and MTHIT#, will require more glue
logic to extend the design to three or more processors.
The snoop results MHITMO# and MTHIT# must
now be considered for multiple cores when a snoop has
been issued, and the master MBC must not sample
these results until all snooping cores have issued their
MSWENDO#.


Figure C-7. Interprocessor Communications in Two Processor System

To resolve these issues, common bus lines carrying
these signals are introduced, where all cores have outputs
driving these lines, and inputs to sample them. The
characteristics of such MTHIT# and MHITM# lines
are straightforward: the line should default to 1, and if
any core drives one of these outputs low, the line
should be pulled low. The MTHIT# line has the simplest
solution. As shown in Figure C-8, by passing the
signal which is produced by the core through an open
collector buffer, the buffered MTHIT#s can be tied to
a single line which is sampled directly by all cores’
MWB/WT# pins. The open collector buffer sinks current
like a normal gate output to drive a logic 0, but
instead of driving current for a logic 1, the open collector
device assumes a high impedance state for logic 1.
Thus, if all of the cores output MTHIT# as 1, the
MTHIT# line remains at a logic 1 level because of the
pull-up resistor. If one or more cores output a logic 0,
the MTHIT# line will be pulled to the logic 0 level.
This precisely matches the desired behavior of
MTHIT# for the system: if one or more cores have
the snooped data cached, the master MWB/WT# input
must be asserted low. It is important to note that
the MTHIT# output of the master is floated: because
the 82495 MTHIT# output only changes on each new
snoop, the value of the master MTHIT# output for the
previous snoop would otherwise erroneously be included
in deciding the level of the MTHIT# line.
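
The wired-AND behavior of the shared MTHIT# line can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than a description of the actual hardware: the function names are hypothetical, and a floated output is represented by None.

```python
def open_collector_line(outputs):
    """Level of a pulled-up line driven through open collector buffers.

    outputs -- iterable of 0, 1, or None (None means the output is floated)
    """
    # A buffer sinking current (logic 0) pulls the line low; a logic 1 or a
    # floated output leaves the line to the pull-up resistor.
    return 0 if any(level == 0 for level in outputs) else 1


def mthit_line(mthit_outputs, master_index):
    """Shared MTHIT# level, with the master's stale output floated."""
    driven = [None if i == master_index else level
              for i, level in enumerate(mthit_outputs)]
    return open_collector_line(driven)
```

For example, with three cores whose MTHIT# outputs are 0, 1, 1 and core 0 acting as master, the stale 0 from the master's previous snoop is ignored and the line stays at 1.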

The MHITM# line follows the same principle as the
MTHIT# line. The MHITM# signal is not floated in
the master core, and poses the problem which floating
MTHIT# avoids: the value of the master’s last
MHITMO# output is still present when the new access is
being made. To resolve this, the inverted value of
MHLDA is ORed with MHITMO# before going to
the open collector buffer. The master’s MHLDA is always
a 0, so the OR gate will always guarantee a 1
being passed from the master to the MHITM# line.
Again, if one or more of the snooping MBCs outputs a
logic 0, the MHITM# line will properly assume a 0
level.
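
The MHLDA gating can be checked with a short Python model. The names below are hypothetical; each core is described by its MHITMO# level and its MHLDA level (0 in the master, 1 in a snooper, as stated above).

```python
def mhitm_buffer_input(mhitmo_n: int, mhlda: int) -> int:
    """OR of the inverted MHLDA with MHITMO#, as fed to the open collector buffer."""
    return (1 - mhlda) | mhitmo_n


def mhitm_line(cores) -> int:
    """Shared MHITM# level for a list of (mhitmo_n, mhlda) pairs."""
    gated = [mhitm_buffer_input(mhitmo_n, mhlda) for mhitmo_n, mhlda in cores]
    # Open collector wired-AND: any 0 pulls the line low, otherwise pull-up wins.
    return 0 if 0 in gated else 1
```

A master holding a stale MHITMO# of 0 contributes a 1 through the OR gate, so only a snooper that really hit a modified line can pull MHITM# low.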

The open collector buffer presents an easy way to add
new MBCs to the shared lines. The desired behavior of
a shared MSWENDA (MSWEND All) line is different
from the attribute lines, MTHIT# and MHITM#.
Where the master core should sample a 0 if one or
more snooping cores drive a 0 on these attribute
lines, the master core must not receive its MSWENDI#
indication until all cores in the system have asserted
their MSWENDO# output. The answer is to
invert the MSWENDO# output of each snooper, so that
a zero is driven onto the MSWENDA line while the
snoop is being performed, and a one is output once the
snoop has completed. From the MSWENDI# perspective,
MSWENDI# should not be asserted at the master
core if any snooping core is still driving a zero on the
MSWENDA line (is not done snooping). Therefore, the
MSWENDA line is the opposite logic polarity of the
actual MSWENDO# signal. The master samples
MSWENDA after the signal passes through an inverter,
to restore the logic level. The output of each core
is passed through an inverter before going to the open
collector buffer. The inverting device is a NAND gate
because the MSWENDO# signal shares the problem of
MHITM#, and must be “faked” by the master. In this
case, instead of the last snoop’s results causing the
problem, the master’s MSWENDO# signal is reset to 1
(still snooping) when the SNPSTB# line is asserted.
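
The MSWENDA scheme can be sketched in Python as follows. The text does not name the second input of the NAND gate; this sketch assumes it is MHLDA (0 in the master, 1 in a snooper), by analogy with the MHITM# gating, and all function names are hypothetical.

```python
def mswenda_driver(mswendo_n: int, mhlda: int) -> int:
    """NAND of MSWENDO# and MHLDA (the MHLDA input is an assumption)."""
    return 1 - (mswendo_n & mhlda)


def mswendi_n(cores) -> int:
    """MSWENDI# level seen by the master for a list of (mswendo_n, mhlda) pairs."""
    drivers = [mswenda_driver(mswendo_n, mhlda) for mswendo_n, mhlda in cores]
    # Open collector wired-AND: MSWENDA is 1 only when no core drives a 0.
    mswenda = 0 if 0 in drivers else 1
    # The master samples MSWENDA through an inverter.
    return 1 - mswenda
```

With the master at (1, 0), its reset MSWENDO# never holds the line low, and MSWENDI# goes to 0 (asserted) only after every snooper has driven its MSWENDO# low.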

Again, these simple adaptations can be implemented in
a similar manner in the logic of the state machines. The
MHITMO# line can be forced to a logic one or floated
when the core is a master (after YBGT, for example).
The MSWEND signal might be implemented as an
asserted-high system signal, if open collector buffers are
used to attach new cores to the shared system bus.
STATE MACHINES AND SCHEMATICS

STATE DIAGRAMS

[State diagram figures not reproduced; only fragments of the transition labels survive. Signal definitions recoverable from the labels:]

PXSAS  = XSAS.ENXSAS.XSNPWB#
PSWBAS = XSAS.ENXSAS.XSNPWB
PLD CODES


; Declaration Segment

TITLE    AYMBTRCK
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YMBTRCK state machine.

; PIN Declarations

PIN  1  MCLK       COMBINATORIAL ;
PIN  2  MRESET     COMBINATORIAL ;
PIN  3  /WMSWND    COMBINATORIAL ;
PIN  4  /MBOFFI    COMBINATORIAL ;
PIN  5  /PXSAS     COMBINATORIAL ;
PIN  6  /PSWBAS    COMBINATORIAL ;
PIN  7  MHOLD      COMBINATORIAL ;
PIN  8  /MNA       COMBINATORIAL ;
PIN  9  /WMNA      COMBINATORIAL ;
PIN 10  /YMLOCK    COMBINATORIAL ;
PIN 11  /YSWEHITM  COMBINATORIAL ;
PIN 12  GND
PIN 13  /PCTCXFR   COMBINATORIAL ;
PIN 14  /RSTRT     COMBINATORIAL ;
PIN 15  /YMEOC     COMBINATORIAL ;
PIN 16  UNUSED     registered ;
PIN 17  /YBGT      registered ;
PIN 18  /YMADS     registered ;
PIN 19  /MAOE      registered ;
PIN 20  /YNOPIPE   registered ;
PIN 21  /YMSTR     registered ;
PIN 22  /YPIPE     registered ;
PIN 23  /YMSEL     registered ;
PIN 24  VCC

; Boolean Equation Segment

EQUATIONS


YNOPIPE := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
         + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
         + /MRESET * /PXSAS * /YMEOC * /YSWEHITM * YNOPIPE
         + /MRESET * /YMEOC * /WMSWND * /YSWEHITM * YNOPIPE
         + /MRESET * YMEOC * /YSWEHITM * /PCTCXFR * YPIPE
         + /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
         + /MRESET * /MNA * /WMNA * /YMEOC * /YSWEHITM * YNOPIPE
         + /MRESET * MHOLD * /YMLOCK * /YMEOC * /YSWEHITM * YNOPIPE
         + /MRESET * PXSAS * /MHOLD * /PCTCXFR * YMSTR * /MAOE
         + /MRESET * PXSAS * YMLOCK * /PCTCXFR * YMSTR * /MAOE
         + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
         + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
           * YMSTR
         + /MRESET * PXSAS * YMLOCK * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
           * YMSTR

YPIPE := /MRESET * /YMEOC * YPIPE
       + /MRESET * PXSAS * /MHOLD * MNA * /YMEOC * WMSWND * /YSWEHITM
         * YNOPIPE
       + /MRESET * PXSAS * /MHOLD * WMNA * /YMEOC * WMSWND * /YSWEHITM
         * YNOPIPE
       + /MRESET * PXSAS * MNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM
         * YNOPIPE
       + /MRESET * PXSAS * WMNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM
         * YNOPIPE

YMSTR := /MRESET * YPIPE
       + /MRESET * /YMEOC * YNOPIPE
       + /MRESET * YSWEHITM * YNOPIPE
       + /MRESET * /MHOLD * YMSTR
       + /MRESET * YMLOCK * YMSTR
       + /MRESET * /MHOLD * /MBOFFI * /MAOE
       + /MRESET * PCTCXFR * YMSTR * /MAOE
       + /MRESET * RSTRT * YMSTR * /MAOE

MAOE := /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
      + /MRESET * /YMEOC * YPIPE
      + /MRESET * YMLOCK * YMEOC * YNOPIPE
      + /MRESET * /YMEOC * /YSWEHITM * YNOPIPE
      + /MRESET * /YSWEHITM * /PCTCXFR * YPIPE
      + /MRESET * /YMEOC * /YMSTR * MAOE
      + /MRESET * YMLOCK * /YSWEHITM * /PCTCXFR * YMSTR
      + /MRESET * /MHOLD * /PCTCXFR * YMSTR * /MAOE
      + /MRESET * YMLOCK * /PCTCXFR * YMSTR * /MAOE
      + /MRESET * /MHOLD * /MBOFFI * /YMSTR * /MAOE
      + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
      + /MRESET * /MHOLD * /YMLOCK * /YMEOC * /YNOPIPE * MAOE
      + /MRESET * /MHOLD * /YMLOCK * YMEOC * /YPIPE * YMSTR * MAOE
      + /MRESET * /PXSAS * YMLOCK * /YNOPIPE * /YPIPE * YMSTR * MAOE

YMADS := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
       + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
       + /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
       + /MRESET * PXSAS * /MHOLD * /PCTCXFR * YMSTR * /MAOE
       + /MRESET * PXSAS * YMLOCK * /PCTCXFR * YMSTR * /MAOE
       + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
       + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
         * YMSTR
       + /MRESET * PXSAS * YMLOCK * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
         * YMSTR
       + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
       + /MRESET * PXSAS * /MHOLD * MNA * WMSWND * /YSWEHITM * YNOPIPE
       + /MRESET * PXSAS * /MHOLD * WMNA * WMSWND * /YSWEHITM * YNOPIPE
       + /MRESET * PXSAS * MNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
       + /MRESET * PXSAS * WMNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE

YBGT := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
      + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
      + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
      + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
      + /MRESET * PXSAS * /MHOLD * MNA * WMSWND * /YSWEHITM * YNOPIPE
      + /MRESET * PXSAS * /MHOLD * WMNA * WMSWND * /YSWEHITM * YNOPIPE
      + /MRESET * PXSAS * MNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
      + /MRESET * PXSAS * WMNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
      + /MRESET * PXSAS * YMLOCK * /YNOPIPE * /YPIPE * YMSTR * MAOE
      + /MRESET * PXSAS * /MHOLD * /PCTCXFR * /RSTRT * YMSTR * /MAOE
      + /MRESET * PXSAS * YMLOCK * /PCTCXFR * /RSTRT * YMSTR * /MAOE
      + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
        * YMSTR * MAOE

YMSEL := /MRESET * /YMEOC * YPIPE
       + /MRESET * /YMEOC * /YSWEHITM * YNOPIPE
       + /MRESET * /YSWEHITM * /PCTCXFR * YPIPE
       + /MRESET * /YMEOC * WMSWND * PCTCXFR * /RSTRT * YMSTR * /MAOE

UNUSED := VCC

; Declaration Segment

TITLE    AYMEMLEN
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YMEMLEN and YCPUEOC state machines

; PIN Declarations

PIN  1  MCLK       COMBINATORIAL ;
PIN  2  MRESET     COMBINATORIAL ;
PIN  3  /YMSEL     COMBINATORIAL ;
PIN  4  /YPIPE     COMBINATORIAL ;
PIN  5  /MABORT    COMBINATORIAL ;
PIN  6  /MBRDY     COMBINATORIAL ;
PIN  7  /WMSWND    COMBINATORIAL ;
PIN  8  XLRDYSRC   COMBINATORIAL ;
PIN  9  /XLKCACHE  COMBINATORIAL ;
PIN 10  /MKEN      COMBINATORIAL ;
PIN 11  LEN        COMBINATORIAL ;
PIN 12  GND
PIN 13  /CACHE     COMBINATORIAL ; INPUT
PIN 14  /YMEOC     COMBINATORIAL ; INPUT
PIN 15  /SVR0      COMBINATORIAL ; INPUT
PIN 16  /SVR1      COMBINATORIAL ; INPUT
PIN 17  /SVR2      COMBINATORIAL ; INPUT
PIN 18  /SVR3      COMBINATORIAL ; INPUT
PIN 19  /YCEOC     registered ;
PIN 20  /SVL0      registered ;
PIN 21  /SVL1      registered ;
PIN 22  /SVC0      registered ;
PIN 23  /SVC1      registered ;
PIN 24  VCC

EQUATIONS




SVL1 := /YMEOC * /MRESET * SVL1
      + YPIPE * LEN * /XLKCACHE * /MRESET * SVL1
      + YMSEL * LEN * /MRESET * /SVL1 * /SVL0
      + YPIPE * LEN * XLRDYSRC * /MKEN * /MRESET * SVL1
      + YPIPE * YMEOC * LEN * /XLKCACHE * /MRESET * SVL0
      + YMSEL * /XLRDYSRC * XLKCACHE * /MRESET * /SVL1 * /SVL0
      + YMSEL * XLKCACHE * MKEN * /MRESET * /SVL1 * /SVL0
      + YPIPE * YMEOC * LEN * XLRDYSRC * /MKEN * /MRESET * SVL0

SVL0 := YMSEL * /XLRDYSRC * XLKCACHE * /MRESET * /SVL1 * /SVL0
      + YMSEL * XLKCACHE * MKEN * /MRESET * /SVL1 * /SVL0
      + /YMEOC * /MRESET * SVL0
      + YPIPE * /LEN * /XLKCACHE * /MRESET * SVL0
      + YMSEL * /LEN * /MRESET * /SVL1 * /SVL0
      + YPIPE * YMEOC * /LEN * /XLKCACHE * /MRESET * SVL1
      + YPIPE * /LEN * XLRDYSRC * /MKEN * /MRESET * SVL0
      + YPIPE * YMEOC * /LEN * XLRDYSRC * /MKEN * /MRESET * SVL1

SVC1 := /YMEOC * /YCEOC * /MRESET * SVC1
      + YMSEL * XLRDYSRC * /MRESET * /SVC1 * /SVC0
      + YPIPE * YMEOC * /CACHE * XLRDYSRC * /MRESET * SVC1
      + YPIPE * YMEOC * /CACHE * XLRDYSRC * /MRESET * SVC0

SVC0 := /YMEOC * /MRESET * SVC0
      + /YMEOC * YCEOC * /MRESET * SVC1
      + YPIPE * /XLRDYSRC * /MRESET * SVC0
      + YPIPE * YMEOC * /XLRDYSRC * /MRESET * SVC1
      + YMSEL * CACHE * /MRESET * /SVC1 * /SVC0
      + YMSEL * /XLRDYSRC * /MRESET * /SVC1 * /SVC0

YCEOC := SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + SVR3 * /SVR1 * /SVR0 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * SVR1 * /SVR0 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * SVR0 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * SVC1 * /SVC0 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * /SVR0 * SVL1 * SVL0 * SVC1 * /SVC0 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * SVC1 * MBRDY * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * /SVR0 * SVL1 * SVL0 * SVC1 * MBRDY * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0 * SVC1
         * MBRDY * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVC1 * /SVC0
         * MBRDY * WMSWND * /MABORT * /MRESET * /YCEOC

; Declaration Segment

TITLE    BYRDYSTR
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YRDYSTR, YRDYSTR, and YMEMEOC state machines.

; PIN Declarations

PIN  1  MCLK       COMBINATORIAL ;
PIN  2  MRESET     COMBINATORIAL ;
PIN  3  TR4        COMBINATORIAL ;
PIN  4  /YALLOC    COMBINATORIAL ;
; PIN  5  ####
PIN  6  /MABORT    COMBINATORIAL ;
PIN  7  /MBRDY     COMBINATORIAL ;
PIN  8  /WMSWND    COMBINATORIAL ;
PIN  9  /MBOFFI    COMBINATORIAL ;
PIN 10  /YMSEL     COMBINATORIAL ;
PIN 11  /YPIPE     COMBINATORIAL ;
PIN 12  GND
PIN 13  /SVL1      COMBINATORIAL ; INPUT
PIN 14  /SVL0      COMBINATORIAL ; INPUT
PIN 15  /SVR3      registered ;
PIN 16  /SVR2      registered ;
PIN 17  /SVR1      registered ;
PIN 18  /SVR0      registered ;
PIN 19  /YMEOC1    registered ;
PIN 20  /YMEOC     registered ;
PIN 21  /YMFRZ     registered ;
PIN 22  /CTCEND    registered ;
PIN 23  /SV        registered ;
PIN 24  VCC

; Boolean Equation Segment

EQUATIONS


SVR3 = /MRESET * /YMEOC * /MBRDY * /MABORT * SVR3
     + /MRESET * MBRDY * /MABORT * /MBOFFI * SVR2
     + /MRESET * /MBRDY * /MABORT * SVR3 * SVR2
     + /MRESET * MBRDY * /MABORT * SVR2 * SVR1
     + /MRESET * /YMEOC * /MABORT * SVR3 * SVR0
     + /MRESET * MBRDY * /MABORT * /TR4 * /SVR3 * SVR2

SVR2 = /MRESET * /MBRDY * /MABORT * SVR2
     + /MRESET * /MABORT * SVR2 * SVR1
     + /MRESET * /YMEOC * MBRDY * /MABORT * SVR1
     + /MRESET * MBRDY * /MABORT * MBOFFI * SVR1
     + /MRESET * MBRDY * /MABORT * SVR1 * SVR0

SVR1 = /MRESET * /MABORT * SVR1 * SVR0
     + /MRESET * /YMEOC * /MBRDY * /MABORT * SVR1
     + /MRESET * /MBRDY * /MABORT * MBOFFI * SVR1
     + /MRESET * /MBRDY * /MABORT * SVR2 * SVR1
     + /MRESET * /YMEOC * MBRDY * /MABORT * /SVR3 * SVR0
     + /MRESET * MBRDY * /MABORT * MBOFFI * /SVR3 * SVR0
     + /MRESET * /YMEOC * MBRDY * /MABORT * SVR3 * /SVR2 * /SVR0

SVR0 = /MRESET * MBRDY * /MABORT * /MBOFFI * SVR3
     + /MRESET * MBRDY * /MABORT * SVR3 * /SVR2
     + /MRESET * /YMEOC * /MBRDY * /MABORT * SVR0
     + /MRESET * /MBRDY * /MABORT * SVR1 * SVR0
     + /MRESET * /MBRDY * /MABORT * MBOFFI * /SVR3 * SVR0
     + /MRESET * YPIPE * YMEOC * MBRDY * /MABORT * /SVR1 * SVR0
     + /MRESET * YMSEL * MBRDY * /MABORT * /SVR2 * /SVR1 * /SVR0
     + /MRESET * MBRDY * /MABORT * MBOFFI * /SVR2 * /SVR1 * /SVR0
     + /MRESET * YPIPE * YMEOC * MBRDY * /MABORT * /SVR2 * SVR1
       * /SVR0

CTCEND = /MRESET * MBRDY * /MABORT * MBOFFI * SVR3 * SVR2
       + /MRESET * MBRDY * /MABORT * MBOFFI * TR4 * SVR2 * /SVR1

YMEOC = /MRESET * MABORT * YALLOC * /YMEOC * /SV
      + /MRESET * /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0
        * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0
        * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVL0
        * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * TR4 * WMSWND
        * /MABORT * /YMEOC * /SV
      + /MRESET * /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * MBRDY
        * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0
        * MBRDY * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * SVR3 * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0
        * MBRDY * WMSWND * /MABORT * /YMEOC * /SV
      + /MRESET * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0 * TR4 * MBRDY
        * WMSWND * /MABORT * /YMEOC * /SV

SV = /MRESET * YMEOC

/YMFRZ = MRESET
       + /MABORT
       + /YALLOC
       + YMEOC
       + SV

YMEOC1 = /MRESET * MABORT * YALLOC * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * TR4 * WMSWND
         * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * MBRDY
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0 * TR4 * MBRDY
         * WMSWND * /MABORT * /YMEOC1 * /SV

TITLE    EABORT
PATTERN  A
REVISION 2.3
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/5/91
CHIP     x01 85C224

; This PLD contains the YABORT, YRSTRT, and YMEMDOE state machines.

PIN  1  MCLK
PIN  2  MRESET
PIN  3  WMSWND
PIN  4  YSWEHITM
PIN  5  YALLOC
PIN  6  YPIPE
PIN  7  YNOPIPE
PIN  8  YMEOC
PIN  9  MHITMI
PIN 10  MAOE
PIN 11  CTCEND
PIN 13  MKEN
PIN 14  MHLDA
PIN 15  YWR
PIN 16  YMADS
PIN 23  CTCDIS
PIN 18  RSTRT
PIN 19  YMDOE
PIN 20  PCTCXFR
PIN 21  TRIABORT
PIN 22  MABORT
PIN 17  SV         ; Swapped pins 23 and 17 to fit 85C224

EQUATIONS




/RSTRT.D := /MRESET * /PCTCXFR * /CTCDIS
          + /MRESET * /PCTCXFR * /RSTRT
          + /MRESET * /YSWEHITM * /CTCDIS * RSTRT
          + /MRESET * /YSWEHITM * YWR * YALLOC * RSTRT
RSTRT.CLKF = MCLK
RSTRT.RSTF = GND
RSTRT.SETF = GND
RSTRT.TRST = VCC

/YMDOE.D := /MRESET * YWR * /YPIPE * /YMEOC
          + /MRESET * /YNOPIPE * YMEOC * /YMDOE
          + /MRESET * MHLDA * YMEOC * /YMDOE
          + /MRESET * /YPIPE * YMEOC * /YMDOE
          + /MRESET * /YMADS * YWR * /YNOPIPE * YMDOE
          + /MRESET * /YMADS * YWR * MHLDA * YMDOE
YMDOE.CLKF = MCLK
YMDOE.RSTF = GND
YMDOE.SETF = GND
YMDOE.TRST = VCC

/PCTCXFR.D := /MRESET * YALLOC * /MABORT
            + /MRESET * /YSWEHITM * /MAOE * PCTCXFR
            + /MRESET * /MHITMI * /WMSWND * /MABORT
            + /MRESET * CTCEND * /PCTCXFR * MABORT
            + /MRESET * /PCTCXFR * MABORT * /SV
            + /MRESET * /YSWEHITM * /YPIPE * YMEOC * PCTCXFR
            + /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR
            + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
PCTCXFR.CLKF = MCLK
PCTCXFR.RSTF = GND
PCTCXFR.SETF = GND
PCTCXFR.TRST = VCC

/TRIABORT.D := /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR * /MHLDA
             + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
               * /MHLDA
             + /MRESET * /YSWEHITM * /MAOE * YPIPE * PCTCXFR * /MHLDA
             + /MRESET * /YSWEHITM * /MAOE * /YMEOC * PCTCXFR * /MHLDA
             + /MRESET * /YMEOC * /PCTCXFR * MABORT * /SV * /MHLDA
TRIABORT.CLKF = MCLK
TRIABORT.RSTF = GND
TRIABORT.SETF = GND
TRIABORT.TRST = /MHLDA

/MABORT.D := /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR
           + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
           + /MRESET * /YSWEHITM * /MAOE * YPIPE * PCTCXFR
           + /MRESET * /YSWEHITM * /MAOE * /YMEOC * PCTCXFR
           + /MRESET * /YMEOC * /PCTCXFR * MABORT * /SV
MABORT.CLKF = MCLK
MABORT.RSTF = GND
MABORT.SETF = GND
MABORT.TRST = VCC

/SV.D := MABORT * /SV
       + PCTCXFR
SV.CLKF = MCLK
SV.RSTF = GND
SV.SETF = GND
SV.TRST = VCC

; Declaration Segment

TITLE    EASTB
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C224

; This PLD contains the XASTB, XSTFAIL, and XDTSTRCK state machines

; Pin Declarations

PIN  1  CLK
PIN  2  RESET
PIN  3  CADS
PIN  4  CDTS
PIN  5  SNPADS
PIN  6  CWR
PIN  7  YSBGT
PIN  8  CRDY
PIN  9  CAHOLD
PIN 10  FSIOUT
PIN 11  SLFTST
PIN 13  OEx
PIN 14  RDYSRC
PIN 16  LRDYSRC
PIN 17  SV2
PIN 18  STFAIL
PIN 19  SV1
PIN 20  XSNPWB
PIN 21  WSDTS
PIN 22  XAS

; OE control inverted during design conversion.

EQUATIONS

LRDYSRC.D    := RDYSRC
LRDYSRC.CLKF  = CLK
LRDYSRC.RSTF  = GND
LRDYSRC.SETF  = GND
/LRDYSRC.TRST = OEx

/SV2.D := /RESET * FSIOUT * /SLFTST * STFAIL * SV2
SV2.CLKF  = CLK
SV2.RSTF  = GND
SV2.SETF  = GND
/SV2.TRST = OEx

/STFAIL.D := /RESET * FSIOUT * /CAHOLD * /SLFTST * STFAIL * SV2
STFAIL.CLKF  = CLK
STFAIL.RSTF  = GND
STFAIL.SETF  = GND
/STFAIL.TRST = OEx

/SV1.D := /RESET * CDTS * CRDY * /SV1
        + /RESET * CDTS * WSDTS * /SV1
        + /RESET * /SNPADS * XSNPWB * SV1
        + /RESET * /CDTS * CRDY * /WSDTS * XSNPWB
SV1.CLKF  = CLK
SV1.RSTF  = GND
SV1.SETF  = GND
/SV1.TRST = OEx

/XSNPWB.D := /RESET * CRDY * /XSNPWB
           + /RESET * /CDTS * WSDTS * /SV1
XSNPWB.CLKF  = CLK
XSNPWB.RSTF  = GND
XSNPWB.SETF  = GND
/XSNPWB.TRST = OEx

/WSDTS.D := /RESET * CRDY * /XSNPWB
          + /RESET * /CDTS * XSNPWB
          + /RESET * /WSDTS * /SV1
          + /RESET * SNPADS * CRDY * /WSDTS
WSDTS.CLKF  = CLK
WSDTS.RSTF  = GND
WSDTS.SETF  = GND
/WSDTS.TRST = OEx

/XAS.D := /RESET * SNPADS * YSBGT * /XAS
        + /RESET * /CDTS * CWR * XAS
        + /RESET * /CADS * /CWR * XAS
XAS.CLKF  = CLK
XAS.RSTF  = GND
XAS.SETF  = GND
/XAS.TRST = OEx

TITLE    EBGTKWN
PATTERN
REVISION 1.0
AUTHOR
COMPANY  INTEL
DATE
CHIP     INTEL 85C224

; This PLD contains the XBGTKWND, XCNA and XENBGT state machines

PIN  1  CLK
PIN  2  RESET
PIN  3  YSBGT
PIN  4  CRDY
PIN  5  C8LDRV
PIN  6  TR4
PIN  7  NC5
PIN  8  NC6
PIN  9  WCPLB
PIN 10  CNADIS
PIN 11  NC1
PIN 13  OE
PIN 14  NC2
PIN 15  NC3
PIN 23  NC4
PIN 16  CKENLC
PIN 17  ENBGT
PIN 18  CNA
PIN 19  PBGT
PIN 20  KWEND
PIN 21  C5BGT
PIN 22  BGT

EQUATIONS



/CKENLC.D := /RESET * YSBGT * CRDY * /BGT * /KWEND
           + /RESET * CRDY * ENBGT * /BGT * /KWEND
CKENLC.CLKF  = CLK
CKENLC.RSTF  = GND
CKENLC.SETF  = GND
/CKENLC.TRST = OE

ENBGT.D := /RESET * /YSBGT
ENBGT.CLKF  = CLK
ENBGT.RSTF  = GND
ENBGT.SETF  = GND
/ENBGT.TRST = OE

/CNA.D := /RESET * CRDY * /CNA
        + /RESET * /YSBGT * WCPLB * CNADIS
        + /RESET * /YSBGT * WCPLB * /CNA
        + /RESET * /PBGT * WCPLB * /CNA
        + /RESET * /BGT * WCPLB * CNADIS * CNA
CNA.CLKF  = CLK
CNA.RSTF  = GND
CNA.SETF  = GND
/CNA.TRST = OE

/PBGT.D := /RESET * CRDY * /PBGT
         + /RESET * /YSBGT * CRDY * /ENBGT * /BGT * /KWEND
PBGT.CLKF  = CLK
PBGT.RSTF  = GND
PBGT.SETF  = GND
/PBGT.TRST = OE

/KWEND.D := /RESET * /BGT * KWEND
          + /RESET * /CRDY * /PBGT
          + /RESET * YSBGT * CRDY * /BGT
          + /RESET * CRDY * ENBGT * /BGT
          + RESET * TR4
KWEND.CLKF  = CLK
KWEND.RSTF  = GND
KWEND.SETF  = GND
/KWEND.TRST = OE

/C5BGT.D := /RESET * /BGT * KWEND
          + /RESET * /CRDY * /PBGT
          + /RESET * YSBGT * CRDY * /BGT
          + /RESET * CRDY * ENBGT * /BGT
          + /RESET * /YSBGT * /CRDY * /ENBGT * /BGT
          + /RESET * /YSBGT * /ENBGT * C5BGT * KWEND * PBGT
          + RESET * /C8LDRV
C5BGT.CLKF  = CLK
C5BGT.RSTF  = GND
C5BGT.SETF  = GND
/C5BGT.TRST = OE

/BGT.D := /RESET * /BGT * KWEND
        + /RESET * /CRDY * /PBGT
        + /RESET * YSBGT * CRDY * /BGT
        + /RESET * CRDY * ENBGT * /BGT
        + /RESET * /YSBGT * /CRDY * /ENBGT * /BGT
        + /RESET * /YSBGT * /ENBGT * C5BGT * KWEND * PBGT
BGT.CLKF  = CLK
BGT.RSTF  = GND
BGT.SETF  = GND
/BGT.TRST = OE

; Declaration Segment

TITLE    EBRDY
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C224

; This PLD contains the XBRDY, XMSWNDO, and XCTRCK state machines.

; Pin Declarations

PIN  1  CLK
PIN  2  RESET
PIN  3  CLEN1
PIN  4  WSDTS
PIN  5  YSCEOC
PIN  6  SNPCYC
PIN  7  MSNPSTB
PIN  8  CKENLC
PIN  9  MKEN
PIN 13  OEx
PIN 15  CKEN
PIN 17  SV
PIN 18  PNDCEOC
PIN 19  ENBRDY
PIN 20  BRDY
PIN 21  BRDY1
PIN 22  MSWNDO

EQUATIONS


/CKEN = CKENLC * /MKEN 
+ /CKENLC ^ /CKEN 
+ /MKEN * /CKEN 
CKEN.TRST = VCC 

/SV.D := /RESET ^ BRDY * /SV 
+ /RESET * CLENl * /SV 

+ /RESET * /YSCEOC * BRDY * ENBRDY * /PNDCEOC 
+ /RESET * /YSCEOC CLENl * ENBRDY * /PNDCEOC 
SV.CLKF “ CLK 
SV.RSTF - GND 
SV.SETF = GND 
/SV.TRST = OEx 


/PNDCEOC.D := /RESET * /SV
+ /RESET * BRDY * /PNDCEOC
+ /RESET * CLEN1 * /PNDCEOC
+ /RESET * /YSCEOC * ENBRDY * /PNDCEOC
+ /RESET * /YSCEOC * /ENBRDY * PNDCEOC
PNDCEOC.CLKF = CLK
PNDCEOC.RSTF = GND
PNDCEOC.SETF = GND
/PNDCEOC.TRST = OEx


/ENBRDY.D := YSCEOC * PNDCEOC
+ /ENBRDY * PNDCEOC
+ YSCEOC * /BRDY * /CLEN1
+ /YSCEOC * BRDY * /ENBRDY
+ /YSCEOC * CLEN1 * /ENBRDY
+ /BRDY * /CLEN1 * ENBRDY * /PNDCEOC
+ RESET
ENBRDY.CLKF = CLK
ENBRDY.RSTF = GND
ENBRDY.SETF = GND
/ENBRDY.TRST = OEx

/BRDY.D := /RESET * CLEN1 * /BRDY
+ /RESET * /PNDCEOC * /WSDTS * BRDY
+ /RESET * /YSCEOC * /ENBRDY * /WSDTS * BRDY
BRDY.CLKF = CLK
BRDY.RSTF = GND
BRDY.SETF = GND
/BRDY.TRST = OEx

/BRDY1.D := /RESET * CLEN1 * /BRDY1
+ /RESET * /PNDCEOC * /WSDTS * BRDY1
+ /RESET * /YSCEOC * /ENBRDY * /WSDTS * BRDY1
BRDY1.CLKF = CLK
BRDY1.RSTF = GND
BRDY1.SETF = GND
/BRDY1.TRST = OEx

/MSWNDO = /RESET * /SNPCYC
+ /RESET * MSNPSTB * /MSWNDO
MSWNDO.TRST = VCC
; Declaration Segment

TITLE ECYCDEF
PATTERN A
REVISION 2.0
AUTHOR ISIC SILAS
COMPANY INTEL 
DATE 2/5/91 


CHIP xOl 85C224 

; This PLD contains the YMEMLEN state machine 

; Pin Declarations 


PIN 1 YMALE
PIN 2 YMAOE
PIN 3 MHLDA
PIN 4 KCACHE
PIN 5 CWR
PIN 6 CMIO
PIN 7 CDC
PIN 8 LLEN
PIN 9 C5MTHIT
PIN 10 YSNPDIS
PIN 11 NC1
PIN 13 NC2
PIN 14 YNOSWNDI
PIN 23 YWRI
PIN 15 YWR
PIN 16 MTHIT
PIN 17 MLEN
PIN 18 MDC
PIN 19 MMIO
PIN 20 MWR
PIN 21 MCACHE
PIN 22 YNOSWND

EQUATIONS 


/YWR = YMALE * /CWR
+ /YMALE * /YWRI
+ /CWR * /YWRI
YWR.TRST = VCC


/MTHIT = /C5MTHIT * /MWR * MHLDA 
MTHIT.TRST = MHLDA 

/MLEN = YMALE * /LLEN * /YMAOE 
+ /YMALE * /MLEN * /YMAOE 
+ /LLEN * /MLEN * /YMAOE 
MLEN.TRST = /YMAOE 

/MDC = YMALE * /CDC * /YMAOE 
+ /YMALE * /MDC * /YMAOE 
+ /CDC * /MDC * /YMAOE
MDC.TRST = /YMAOE 

/MMIO = YMALE * /CMIO * /YMAOE 
+ /YMALE * /MMIO * /YMAOE 
+ /CMIO * /MMIO * /YMAOE 
MMIO.TRST = /YMAOE 

/MWR = YMALE * /CWR * /YMAOE 
+ /YMALE * /MWR * /YMAOE 
+ /CWR * /MWR * /YMAOE
MWR.TRST = /YMAOE

/MCACHE = YMALE * /KCACHE * /YMAOE
+ /YMALE * /MCACHE * /YMAOE
+ /KCACHE * /MCACHE * /YMAOE
MCACHE.TRST = /YMAOE

/YNOSWND = YMALE * /YSNPDIS
+ YMALE * /CMIO
+ YMALE * CWR * /KCACHE
+ /YMALE * /YNOSWNDI
+ /YSNPDIS * /YNOSWNDI
+ /CMIO * /YNOSWNDI
+ CWR * /KCACHE * /YNOSWNDI
YNOSWND.TRST = VCC
; Declaration Segment

TITLE EMBE 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/5/91 


CHIP xOl 85C224 

; This PLD generates the memory bus byte enables (MBEs) 

; Pin Declarations 


PIN 1 LBE0
PIN 2 LBE1
PIN 3 LBE2
PIN 4 LBE3
PIN 5 LBE4
PIN 6 LBE5
PIN 7 LBE6
PIN 8 LBE7
PIN 9 RDYSRC
PIN 10 KCACHE
PIN 11 YMALE
PIN 13 YMAOE
PIN 14 MBE6I
PIN 23 MBE7I
PIN 15 MBE7
PIN 16 MBE5
PIN 17 MBE4
PIN 18 MBE3
PIN 19 MBE2
PIN 20 MBE1
PIN 21 MBE0
PIN 22 MBE6


EQUATIONS 

/MBE7 = /YMALE * /LBE7 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE7I * /YMAOE
+ /LBE7 * /MBE7I * /YMAOE
+ /KCACHE * /RDYSRC * /MBE7I * /YMAOE
MBE7.TRST = /YMAOE

/MBE5 = /YMALE * /LBE5 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE5 * /YMAOE
+ /LBE5 * /MBE5 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE5 * /YMAOE
MBE5.TRST = /YMAOE

/MBE4 = /YMALE * /LBE4 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE4 * /YMAOE
+ /LBE4 * /MBE4 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE4 * /YMAOE
MBE4.TRST = /YMAOE

/MBE3 = /YMALE * /LBE3 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE3 * /YMAOE
+ /LBE3 * /MBE3 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE3 * /YMAOE
MBE3.TRST = /YMAOE

/MBE2 = /YMALE * /LBE2 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE2 * /YMAOE
+ /LBE2 * /MBE2 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE2 * /YMAOE
MBE2.TRST = /YMAOE

/MBE1 = /YMALE * /LBE1 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE1 * /YMAOE
+ /LBE1 * /MBE1 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE1 * /YMAOE
MBE1.TRST = /YMAOE

/MBE0 = /YMALE * /LBE0 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE0 * /YMAOE
+ /LBE0 * /MBE0 * /YMAOE
+ /KCACHE * /RDYSRC * /MBE0 * /YMAOE
MBE0.TRST = /YMAOE

/MBE6 = /YMALE * /LBE6 * /YMAOE
+ /YMALE * /KCACHE * /RDYSRC * /YMAOE
+ YMALE * /MBE6I * /YMAOE
+ /LBE6 * /MBE6I * /YMAOE
+ /KCACHE * /RDYSRC * /MBE6I * /YMAOE
MBE6.TRST = /YMAOE

; - Declaration Segment 

TITLE EMEMALE 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 


CHIP xOl 85C224 

; This PLD contains the YMALE, YMBRDY, YWMNA, 

; and YIMSWND state machines. 

; - Pin Declarations 


PIN 1 MCLK
PIN 2 MRESET
PIN 3 YBGT
PIN 4 YNOPIPE
PIN 5 YPIPE
PIN 6 MNA
PIN 7 WMSWND
PIN 8 YMEOC
PIN 9 PXSAS
PIN 10 YALLOC
PIN 11 MSWNDI
PIN 13 OE
PIN 14 MBRDY
PIN 23 YMADS
PIN 15 YIMSWND
PIN 16 YDRCTM
PIN 17 NC1
PIN 18 YMALE
PIN 19 DISWND
PIN 20 WMNA
PIN 21 YMBRDY
PIN 22 NC2


EQUATIONS 

/YIMSWND = /MSWNDI * YALLOC * DISWND
YIMSWND.TRST = VCC

/YDRCTM.D := /MRESET * /DISWND
+ /MRESET * YMEOC * YPIPE * /YDRCTM
YDRCTM.CLKF = MCLK
YDRCTM.RSTF = GND
YDRCTM.SETF = GND
/YDRCTM.TRST = OE

NC1.D := VCC
NC1.CLKF = MCLK
NC1.RSTF = GND
NC1.SETF = GND
/NC1.TRST = OE

/YMALE.D := /MRESET * /YPIPE * /YMALE
+ /MRESET * /YBGT * YMALE
+ /MRESET * YNOPIPE * YMEOC * /YMALE
+ /MRESET * WMSWND * YMEOC * /YMALE
+ /MRESET * /YMADS * WMNA * YMEOC * /YMALE
+ /MRESET * MNA * WMNA * YMEOC * /YMALE
YMALE.CLKF = MCLK
YMALE.RSTF = GND
YMALE.SETF = GND
/YMALE.TRST = OE

/DISWND.D := /MRESET * /DISWND * YDRCTM
+ /MRESET * /PXSAS * /YALLOC * YDRCTM
DISWND.CLKF = MCLK
DISWND.RSTF = GND
DISWND.SETF = GND
/DISWND.TRST = OE

/WMNA.D := /MRESET * /YNOPIPE * YMADS * YMEOC * /WMNA
+ /MRESET * /YNOPIPE * YMADS * /MNA * WMNA
+ /MRESET * /YPIPE * YMADS * /MNA * /YMEOC * WMNA
WMNA.CLKF = MCLK
WMNA.RSTF = GND
WMNA.SETF = GND
/WMNA.TRST = OE

/YMBRDY.D := /MRESET * /MBRDY
YMBRDY.CLKF = MCLK
YMBRDY.RSTF = GND
YMBRDY.SETF = GND
/YMBRDY.TRST = OE

NC2 = VCC
NC2.TRST = VCC
TITLE EMSNPST
PATTERN A
REVISION 2.0
AUTHOR ISIC SILAS
COMPANY INTEL
DATE 2/4/91

CHIP xOl 85C224

; This PLD contains the YALLC, YMEMLOCK, YSNPSTB,
; and YMBREQ state machines.

PIN 1 MCLK
PIN 2 MRESET
PIN 3 MKEN
PIN 4 MHLDA
PIN 5 YWR
PIN 6 YNOSWND
PIN 7 YBGT
PIN 8 XLRDYSRC
PIN 9 RFO
PIN 10 SNPDIS
PIN 11 PALLC
PIN 13 KLOCK
PIN 23 PXSAS
PIN 16 YMLOCK
PIN 17 SV2
PIN 18 HBASWB
PIN 19 MBREQ
PIN 20 MSNPSTB
PIN 21 SV1
PIN 22 YALLOC

EQUATIONS



/YMLOCK.D := /MRESET * /YALLOC * YMLOCK
+ /MRESET * /HBASWB * YMLOCK
+ /MRESET * YBGT * /YALLOC * /HBASWB
+ /MRESET * /KLOCK * /YALLOC * /HBASWB
+ /MRESET * /YBGT * /KLOCK * YMLOCK
YMLOCK.CLKF = MCLK
YMLOCK.RSTF = GND
YMLOCK.SETF = GND
YMLOCK.TRST = VCC

/SV2.D := PXSAS * /SV2
+ PXSAS * /MHLDA * /HBASWB
SV2.CLKF = MCLK
SV2.RSTF = GND
SV2.SETF = GND
SV2.TRST = VCC

/HBASWB.D := /MRESET * PXSAS * /HBASWB * SV2
+ /MRESET * PXSAS * /YBGT * MBREQ * SV2
HBASWB.CLKF = MCLK
HBASWB.RSTF = GND
HBASWB.SETF = GND
HBASWB.TRST = VCC

/MBREQ.D := PXSAS * /MBREQ
+ PXSAS * HBASWB * /SV2
+ /PXSAS * /YBGT * MBREQ * HBASWB * SV2
+ MRESET
MBREQ.CLKF = MCLK
MBREQ.RSTF = GND
MBREQ.SETF = GND
MBREQ.TRST = VCC

/MSNPSTB.D := /MRESET * /YBGT * YNOSWND * YWR * MSNPSTB
+ /MRESET * /YBGT * YNOSWND * XLRDYSRC * MSNPSTB
+ /MRESET * /YBGT * YNOSWND * RFO * MSNPSTB
MSNPSTB.CLKF = MCLK
MSNPSTB.RSTF = GND
MSNPSTB.SETF = GND
MSNPSTB.TRST = VCC

/SV1.D :
SV1.CLKF
SV1.RSTF
SV1.SETF
SV1.TRST

/YALLOC.D := /MRESET * PXSAS * /YALLOC * /SV1
+ /MRESET * /MKEN * /YALLOC * SV1
+ /MRESET * /YBGT * /PALLC * SNPDIS * /RFO * YALLOC
YALLOC.CLKF = MCLK
YALLOC.RSTF = GND
YALLOC.SETF = GND
YALLOC.TRST = VCC
; Declaration Segment

TITLE EMZBT 
PATTERN A 
REVISION 3.1 

AUTHOR ISIC SILAS + Andy Bloom 
COMPANY INTEL 
DATE 2/7/91 

CHIP xOl 85C224 

; This PLD contains the YMBRDY state machine. 

; Pin Declarations 


PIN 1 MCLK
PIN 2 MRESET
PIN 3 MAOE
PIN 4 MHLDA
PIN 5 YNOPIPE
PIN 6 YPIPE
PIN 7 MCACHE
PIN 8 YMEOC
PIN 9 MEMZBTEN
PIN 10 SYNC
PIN 11 MALDRV
PIN 13 FLUSH
PIN 14 NCPFLD
PIN 15 FPFLDEN
PIN 23 NC4
PIN 16 NC1
PIN 17 NC2
PIN 18 NC3
PIN 19 YMZBT
PIN 20 FPFLD
PIN 21 YFLUSH
PIN 22 YSYNC


EQUATIONS 

NC1 = VCC
NC1.TRST = VCC

NC2 = VCC
NC2.TRST = VCC

NC3 = VCC
NC3.TRST = VCC

/YMZBT.D := /MRESET * YPIPE * YMEOC * /YMZBT
+ /MRESET * /MCACHE * YMEOC * /YMZBT
+ /MRESET * YNOPIPE * /YPIPE * /MCACHE * /MEMZBTEN
+ /MRESET * YNOPIPE * /YPIPE * /MCACHE * /YMZBT
+ /MRESET * MHLDA * /MAOE * /MEMZBTEN * YMZBT
+ /MRESET * /YNOPIPE * /MCACHE * /MEMZBTEN * YMZBT
+ MRESET
YMZBT.CLKF = MCLK
YMZBT.RSTF = GND
YMZBT.SETF = GND
YMZBT.TRST = VCC

/FPFLD = FPFLDEN * MRESET
FPFLD.TRST = MRESET

/YFLUSH = MRESET * /NCPFLD
+ /MRESET * /FLUSH
YFLUSH.TRST = VCC

/YSYNC = MRESET * /MALDRV
+ /MRESET * /SYNC
YSYNC.TRST = VCC
; Declaration Segment

TITLE ESIGGEN 

PATTERN 

REVISION 1.0 

AUTHOR 

COMPANY INTEL 

DATE 

CHIP INTEL 85C224 

; This PLD drives memory bus and core signals based on the states 

; of other state machines 

; Pin Declarations 

PIN 1 YDRCTM 

PIN 2 YMADS 

PIN 3 YMAOE 

PIN 4 MHLDA 

PIN 5 NC1

PIN 6 MWBWT 

PIN 7 MDRCTM 

PIN 8 SNPDIS 

PIN 9 UNI 

PIN 10 YMSEL 

PIN 11 TR4 

PIN 13 YMFRZ 

PIN 14 MDLDRV 

PIN 23 LMRST 

PIN 15 C8MSEL 

PIN 16 NC2 

PIN 17 CDRCTM 

PIN 18 CWBWT 

PIN 19 MBOFF 

PIN 20 MADS 

PIN 21 YSNPDIS 

PIN 22 C8MFRZ 

EQUATIONS 

/C8MSEL = LMRST * /TR4 

+ /LMRST * /YMSEL 
C8MSEL.TRST = VCC 

NC2 = VCC 
NC2.TRST = VCC 

CDRCTM = MDRCTM * YDRCTM
CDRCTM.TRST = VCC

/CWBWT = /MWBWT * YDRCTM
CWBWT.TRST = VCC

/MBOFF = YMAOE * /MHLDA
MBOFF.TRST = /MHLDA

/MADS = /YMADS * /YMAOE
MADS.TRST = /YMAOE

YSNPDIS = SNPDIS * UNI
YSNPDIS.TRST = VCC

/C8MFRZ = LMRST * /MDLDRV 
+ /LMRST * /YMFRZ 
C8MFRZ.TRST = VCC 
; Declaration Segment

TITLE ESWND 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 


CHIP xOl 85C224 


; This PLD contains the XCRDY, XSWND, and XENSWND state machines.

; Pin Declarations

PIN 1 CLK
PIN 2 RESET
PIN 3 WSDTS
PIN 4 BGT
PIN 5 PBGT
PIN 6 TR4
PIN 7 YSMSWND
PIN 8 SNPDIS
PIN 9 YSMEOC
PIN 10 SLFTST
PIN 13 OEx
PIN 16 ENSWND
PIN 17 SV3
PIN 18 SWEND
PIN 19 SV2
PIN 20 SV1
PIN 21 CRDY
PIN 22 CRDY1

EQUATIONS 



ENSWND.D := /RESET * /YSMSWND
ENSWND.CLKF = CLK
ENSWND.RSTF = GND
ENSWND.SETF = GND
/ENSWND.TRST = OEx

/SV3.D := RESET * TR4
+ /RESET * CRDY * SWEND * /SV3
+ /RESET * /PBGT * /ENSWND * /YSMSWND * CRDY * SV3
+ /RESET * /PBGT * CRDY * /SNPDIS * /SWEND * SV3
+ /RESET * /PBGT * /ENSWND * /YSMSWND * SWEND * SV3
SV3.CLKF = CLK
SV3.RSTF = GND
SV3.SETF = GND
/SV3.TRST = OEx

/SWEND.D := RESET * TR4
+ /RESET * /CRDY * SWEND * /SV3
+ /RESET * PBGT * CRDY * /SWEND * SV3
+ /RESET * /BGT * /SNPDIS * SWEND * SV3
+ /RESET * ENSWND * CRDY * SNPDIS * /SWEND * SV3
+ /RESET * YSMSWND * CRDY * SNPDIS * /SWEND * SV3
+ /RESET * /BGT * /ENSWND * /YSMSWND * SWEND * SV3
+ /RESET * /PBGT * /ENSWND * /YSMSWND * /CRDY * /SWEND * SV3
SWEND.CLKF = CLK
SWEND.RSTF = GND
SWEND.SETF = GND
/SWEND.TRST = OEx

/SV2.D := RESET * /SLFTST
+ /RESET * /YSMEOC * CRDY * /SV2
+ /RESET * /YSMEOC * /CRDY * SV2
SV2.CLKF = CLK
SV2.RSTF = GND
SV2.SETF = GND
/SV2.TRST = OEx

/SV1.D := /RESET * /YSMEOC * CRDY
+ /RESET * CRDY * /SV1 * SV2
SV1.CLKF = CLK
SV1.RSTF = GND
SV1.SETF = GND
/SV1.TRST = OEx

/CRDY.D := RESET * /SLFTST
+ /RESET * /YSMEOC * /WSDTS * CRDY * SV2
+ /RESET * /WSDTS * CRDY * /SV1 * SV2
CRDY.CLKF = CLK
CRDY.RSTF = GND
CRDY.SETF = GND
/CRDY.TRST = OEx

/CRDY1.D := RESET * /SLFTST
+ /RESET * /YSMEOC * /WSDTS * CRDY1 * SV2
+ /RESET * /WSDTS * CRDY1 * /SV1 * SV2
CRDY1.CLKF = CLK
CRDY1.RSTF = GND
CRDY1.SETF = GND
/CRDY1.TRST = OEx
; Declaration Segment

TITLE EWCPLB 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 


CHIP xOl 85C224


; This PLD contains the XWCPLB and YCPULEN state machines.

; Pin Declarations

PIN 1 CLK
PIN 2 RESET
PIN 3 CRDY
PIN 4 RDYSRC
PIN 5 BGT
PIN 6 PBGT
PIN 7 KCACHE
PIN 8 LEN
PIN 9 CACHE
PIN 10 CKEN
PIN 11 BRDY
PIN 13 OEx
PIN 16 CLEN4
PIN 17 CLEN2
PIN 18 CLEN1
PIN 19 LKCACHE
PIN 20 SV
PIN 21 CPUEN
PIN 22 WCPLB

EQUATIONS


/CLEN4.D := CPUEN * /CLEN2 * /CLEN4
+ /BRDY * /CLEN1
+ BRDY * CLEN2 * /CLEN4
+ /CACHE * /CKEN * /LKCACHE * /CLEN2 * /CLEN4
+ RESET
CLEN4.CLKF = CLK
CLEN4.RSTF = GND
CLEN4.SETF = GND
/CLEN4.TRST = OEx


/CLEN2.D := CPUEN * /CLEN2 * /CLEN4
+ BRDY * /CLEN2 * CLEN4
+ /BRDY * CLEN2 * CLEN4
+ LEN * CACHE * /CLEN2 * /CLEN4
+ LEN * CKEN * /CLEN2 * /CLEN4
+ LEN * LKCACHE * /CLEN2 * /CLEN4
+ RESET
CLEN2.CLKF = CLK
CLEN2.RSTF = GND
CLEN2.SETF = GND
/CLEN2.TRST = OEx

/CLEN1.D := /RESET * BRDY * /CLEN1
+ /RESET * /BRDY * /CLEN2 * CLEN4
+ /RESET * /CPUEN * /LEN * CACHE * /CLEN2 * /CLEN4
+ /RESET * /CPUEN * /LEN * CKEN * /CLEN2 * /CLEN4
+ /RESET * /CPUEN * /LEN * LKCACHE * /CLEN2 * /CLEN4
CLEN1.CLKF = CLK
CLEN1.RSTF = GND
CLEN1.SETF = GND
/CLEN1.TRST = OEx

/LKCACHE.D := /KCACHE
LKCACHE.CLKF = CLK
LKCACHE.RSTF = GND
LKCACHE.SETF = GND
/LKCACHE.TRST = OEx

/SV.D := /RESET * CRDY * /SV
+ /RESET * /RDYSRC * /BGT * CPUEN * SV
+ /RESET * CRDY * /BRDY * /CLEN1 * WCPLB * /CPUEN
SV.CLKF = CLK
SV.RSTF = GND
SV.SETF = GND
/SV.TRST = OEx

/CPUEN.D := /RESET * BRDY * /CPUEN
+ /RESET * CLEN1 * /CPUEN
+ /RESET * RDYSRC * /BGT * /WCPLB
+ /RESET * RDYSRC * /BGT * CPUEN * SV
CPUEN.CLKF = CLK
CPUEN.RSTF = GND
CPUEN.SETF = GND
/CPUEN.TRST = OEx

/WCPLB.D := /RESET * BRDY * /WCPLB
+ /RESET * CLEN1 * /WCPLB
+ /RESET * /CRDY * BRDY * /CPUEN
+ /RESET * /CRDY * CLEN1 * /CPUEN
WCPLB.CLKF = CLK
WCPLB.RSTF = GND
WCPLB.SETF = GND
/WCPLB.TRST = OEx
; Declaration Segment

TITLE EWMSWND 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 


CHIP xOl 85C224 

; This PLD contains the YENMSWND, YWMSWND, and YENXSAS state machines.

; Pin Declarations

PIN 1 MCLK
PIN 2 MRESET
PIN 3 XSAS
PIN 4 XSNPWB
PIN 5 YPIPE
PIN 6 YNOPIPE
PIN 7 YMEOC
PIN 8 MHITMI
PIN 9 YMSWEND
PIN 10 YNOSWND
PIN 11 YBGT
PIN 13 OEx
PIN 14 YALLOC
PIN 23 PCTCXFR
PIN 15 UNUSED
PIN 16 PSWBAS
PIN 17 SV
PIN 18 WMSWND
PIN 19 ENMSWND
PIN 20 ENXSAS
PIN 21 PXSAS
PIN 22 YSWEHITM

EQUATIONS 



UNUSED = VCC
UNUSED.TRST = VCC

/PSWBAS = /XSAS * /XSNPWB * /ENXSAS
PSWBAS.TRST = VCC

/SV.D := YMEOC * /WMSWND * /SV
+ /YNOSWND * /YPIPE * YMEOC * /WMSWND
+ /YPIPE * /YMSWEND * /ENMSWND * YMEOC * /WMSWND
SV.CLKF = MCLK
SV.RSTF = GND
SV.SETF = GND
/SV.TRST = OEx

/WMSWND.D := /MRESET * YMEOC * /WMSWND
+ /MRESET * /WMSWND * /SV
+ /MRESET * /YNOSWND * /YPIPE * /WMSWND
+ /MRESET * /YNOSWND * /YBGT * WMSWND
+ /MRESET * /YPIPE * /YMSWEND * /ENMSWND * /WMSWND
+ /MRESET * YPIPE * /PCTCXFR * /YALLOC * /WMSWND
+ /MRESET * /YMSWEND * /ENMSWND * /YALLOC * WMSWND
+ /MRESET * YNOSWND * /YNOPIPE * /YMSWEND * /ENMSWND * WMSWND
WMSWND.CLKF = MCLK
WMSWND.RSTF = GND
WMSWND.SETF = GND
/WMSWND.TRST = OEx

/ENMSWND.D := YMSWEND
ENMSWND.CLKF = MCLK
ENMSWND.RSTF = GND
ENMSWND.SETF = GND
/ENMSWND.TRST = OEx

/ENXSAS.D := YBGT * /ENXSAS
+ XSAS * ENXSAS
+ MRESET
ENXSAS.CLKF = MCLK
ENXSAS.RSTF = GND
ENXSAS.SETF = GND
/ENXSAS.TRST = OEx

/PXSAS = /XSAS * XSNPWB * /ENXSAS
PXSAS.TRST = VCC

/YSWEHITM = /YMSWEND * /ENMSWND * /MHITMI * YALLOC
+ /YMSWEND * /ENMSWND * /MHITMI * YNOPIPE * YPIPE
YSWEHITM.TRST = VCC
Appendix C Schematic: i860™ XP CPU
Appendix C Schematic: 82495XP Cache Controller
Appendix C Schematic: 82490XP Cache RAM
Appendix C Schematic: 82490XP Cache RAM
Appendix C Schematic: 82490XP Cache RAM
Appendix C Schematic: 82490XP Cache RAM
Appendix C Schematic: Clock Generator
Appendix C Schematic
Appendix C Schematic

80960SA/80960SB
EMBEDDED 32-BIT PROCESSORS 
WITH 16-BIT BURST DATA BUS 


■ High-Performance Embedded 
Architecture 

— 16 MIPS Burst Execution at 16 MHz 
— 5 MIPS* Sustained Execution at 
16 MHz 

■ 512-Byte On-Chip Instruction Cache 
— Direct Mapped 

— Parallel Load/Decode for Uncached 
Instructions 

■ Multiple Register Sets 

— Sixteen Global 32-Bit Registers 
— Sixteen Local 32-Bit Registers 
— Four Local Register Sets Stored 
On-Chip 

— Register Scoreboarding 

■ Software Compatible with
80960KA/KB/CA Processors 


■ Built-In Interrupt Controller 

— 4 Direct Interrupt Pins 

— 32 Priority Levels, 256 Vectors

■ Built-In Floating Point Unit 
(80960SB only) 

— Fully IEEE 754 Compatible 

■ Easy to Use, High Bandwidth 16-Bit Bus 

— 25.6 Mbyte/sec Burst 

— Up to 16 Bytes Transferred per Burst 

■ 32-Bit Address Space, 4 Gigabytes 

■ 80-Lead Quad Flat Pack (EIAJ QFP) 

■ 84-Lead Plastic Leaded Chip Carrier 
(PLCC) 


The 80960SA and 80960SB are members of Intel’s i960 32-bit processor family, which is designed especially
for low cost embedded applications. They are based on the family’s high performance, common core architecture,
and include a 512-byte instruction cache and a built-in interrupt controller. The 80960SA and 80960SB
have a large register set, multiple parallel execution units and a high bandwidth, 16-bit, burst bus. Using
advanced RISC technology, these high performance processors are capable of execution rates in excess of
5 million instructions per second.* The 80960SA and 80960SB are well-suited for a wide range of cost
sensitive embedded applications such as laser printers, EISA and MCA adapters, disk controllers and X
Terminals.

*Relative to Digital Equipment Corporation’s VAX-11/780** at 1 MIPS

**VAX-11™ is a trademark of Digital Equipment Corporation.
THE i960™ PROCESSOR SERIES

The 80960SA and 80960SB are members of a new 
family of 32-bit microprocessors from Intel known as 
the i960 Series. This series was especially designed 
to serve the needs of embedded applications. The 
embedded market includes applications as diverse 
as industrial automation, avionics, image processing, 
graphics, robotics, telecommunications and automobiles.
These types of applications require high integration,
low power consumption, quick interrupt response
times and high performance. Since time to
market is critical, embedded microprocessors need 
to be easy to use in both hardware and software 
designs. 


All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications for the embedded
market. For example, future processors may include
a DMA controller, a timer or an A/D converter.

Software written for the 80960SA and 80960SB will 
run without modification on any other member of the 
80960 family. The 80960SA is pin compatible with 
the 80960SB, which includes an integrated floating- 
point unit. 


[Figure 2 diagram: sixteen 32-bit global registers, g0 through g15 (Notes 1, 4); four 80-bit floating-point registers, fp0 through fp3 (Note 2); sixteen 32-bit local registers, r0 through r15 (Note 3); 32-bit arithmetic controls, instruction pointer, process controls and trace controls; 2^32-byte address space.]
NOTES:

1. Register g15 is reserved for stack management functions.

2. Floating-Point registers and operations are available only in the 960SB and 960KB processors.

3. Registers r0, r1 and r2 are reserved for stack management functions.

4. Register g14 is used by BAL and BALX instructions.


Figure 2. 80960 Register Set 
Key Performance Features

The architecture of the 80960SA and 80960SB is based
on the most recent advances in RISC technology
and is grounded in Intel’s long experience in designing
embedded controllers. Many features contribute
to the exceptional performance of the 80960SA and
80960SB.

1. Large Register Set. Modern compilers can take 
advantage of a large number of registers to optimize 
execution speed. For maximum flexibility, the 
80960SA and 80960SB provide 32 32-bit registers 
and four 80-bit floating-point registers. (See 
Figure 2.) 

2. Fast Instruction Execution. Simple functions 
make up the bulk of instructions in most programs, 
so that execution speed can be greatly improved by 
ensuring that these core instructions execute in as 
short a time as possible. The most-frequently exe- 
cuted instructions such as register-register moves, 
add/subtract, logical operations, and shifts execute 
in one to two cycles (Table 1 contains a list of in- 
structions). 

3. Load/Store Architecture. Like other processors 
based on RISC technology, the 80960SA and
80960SB have a Load/Store architecture. Only the
LOAD and STORE instructions reference memory; 
all other instructions operate on registers. This type 
of architecture simplifies instruction decoding and is 
used in combination with other techniques to in- 
crease parallelism. 

4. Simple Instruction Formats. All instructions in 
the 80960SA and 80960SB are 32 bits long and 
must be aligned on word boundaries. This alignment 
makes it possible to eliminate the instruction-alignment
stage in the pipeline. To simplify the instruction
decoder further, there are only five instruction formats
and each instruction type uses only one format.
(See Figure 3.)

5. Overlapped Instruction Execution. A load operation
allows execution of subsequent instructions to
continue before the data has been returned from
memory, so that these instructions can overlap the
load. The 80960SA and 80960SB manage this process
transparently to software through the use of a
register scoreboard. Conditional instructions also
make use of a scoreboard so that subsequent unrelated
instructions can be executed while the conditional
instruction is pending.
Figure 3. Instruction Formats
Table 1. 80960SA and 80960SB Instruction Set

Data Movement         Arithmetic            Logical               Bit and Bit Field
--------------------  --------------------  --------------------  --------------------
Load                  Add                   And                   Set Bit
Store                 Subtract              Not And               Clear Bit
Move                  Multiply              And Not               Not Bit
Load Address          Divide                Or                    Check Bit
                      Remainder             Exclusive Or          Alter Bit
                      Modulo                Not Or                Scan for Bit
                      Shift                 Or Not                Scan over Bit
                      Extended Multiply     Nor                   Extract
                      Extended Divide       Exclusive Nor         Modify
                                            Not
                                            Nand
                                            Rotate

Comparison            Branch                Call/Return           Fault
--------------------  --------------------  --------------------  --------------------
Compare               Unconditional Branch  Call                  Conditional Fault
Conditional Compare   Conditional Branch    Call Extended         Synchronize Faults
Compare and           Compare and Branch    Call System
  Increment                                 Return
Compare and                                 Branch and Link
  Decrement

Debug                   Miscellaneous                Decimal
----------------------  ---------------------------  --------------------
Modify Trace Controls   Atomic Add                   Move
Mark                    Atomic Modify                Add with Carry
Force Mark              Flush Local Registers        Subtract with Carry
                        Modify Arithmetic Controls
                        Scan Byte for Equal
                        Test Condition Code

Conversion                Floating-Point        Synchronous
(80960SB only)            (80960SB only)
------------------------  --------------------  --------------------
Convert Real to Integer   Move Real             Synchronous Load
Convert Integer to Real   Add                   Synchronous Move
                          Subtract
                          Multiply
                          Divide
                          Remainder
                          Scale
                          Round
                          Square Root
                          Sine
                          Cosine
                          Tangent
                          Arctangent
                          Log
                          Log Binary
                          Log Natural
                          Exponent
                          Classify
                          Copy Real Extended
                          Compare
6. Integer Execution Optimization. When the result
of an operation is used as an operand in a subsequent
calculation, the value is sent immediately to
its destination register. Yet at the same time, the
value is put back on a bypass path to the ALU,
thereby saving the time that otherwise would be required
to retrieve the value for the next operation.

7. Bandwidth Optimizations. The 80960SA and
80960SB get optimal use of their memory bus bandwidth
because the bus is tuned for use with the
cache; the line size of the instruction cache matches
the maximum burst size for instruction fetches. The
80960SA and 80960SB automatically fetch four
words in a burst and store them directly in the
cache. Due to the size of the cache and the fact that
it is continually filled in anticipation of needed
instructions in the program flow, the 80960SA and
80960SB are exceptionally insensitive to memory
wait states. In fact, each wait state causes only a
10% degradation in system performance. The benefit
is that the 80960SA and 80960SB will deliver
outstanding performance even with a low cost memory
system.

8. Cache Bypass. If there is a cache miss, the proc- 
essor fetches the needed instruction, then sends it 
on to the instruction decoder at the same time it 
updates the cache. Thus, no extra time is taken to 
load and read the cache. 


Memory Space and Addressing Modes 

The 80960SA and 80960SB offer a linear program- 
ming environment so that all programs running on 
the processors are contained in a single address 
space. The maximum size of the address space is 
4 Gigabytes. 

For ease of use, the 80960SA and 80960SB have a 
small number of addressing modes, but include all 
those necessary to ensure efficient compiler imple- 
mentations of high-level languages such as C, For- 
tran and Ada. Table 2 lists the memory addressing 
modes. 


Data Types 

The 80960SA and 80960SB recognize the following
data types:

Numeric:

• 8-, 16-, 32- and 64-bit ordinals

• 8-, 16-, 32- and 64-bit integers

• 8-, 16-, 32-, 64- and 80-bit reals


Non-Numeric:

• Bit

• Bit Field

• Triple-Word (96 bits)

• Quad-Word (128 bits)


Large Register Set 


The programming environment of the 80960SA and
80960SB includes a large number of registers. In fact,
32 registers are available at any time. The availability
of this many registers greatly reduces the number of
memory accesses required to execute most programs,
which leads to greater instruction processing
speed.



There are two types of general-purpose registers:
local and global. The global registers consist of sixteen
32-bit registers (G0 through G15). These registers
perform the same function as the general-purpose
registers provided in other popular microprocessors.
The term global refers to the fact that these
registers retain their contents across procedure
calls.


The local registers, on the other hand, are procedure
specific. For each procedure call, the 80960SA
and 80960SB allocate 16 local registers (R0 through
R15). Each local register is 32 bits wide.


Multiple Register Sets 

To further increase the efficiency of the register set,
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 
Table 2. Memory Addressing Modes

• 12-Bit Offset

• 32-Bit Offset

• Register-Indirect

• Register + 12-Bit Offset

• Register + 32-Bit Offset

• Register + (Index-Register x Scale-Factor)

• Register x Scale Factor + 32-Bit Displacement

• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement

Scale-Factor is 1, 2, 4, 8 or 16
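
The addressing modes in Table 2 all reduce to one sum of an optional base register, an optional scaled index register and an optional offset or displacement. The following Python sketch illustrates that arithmetic only; the function name `effective_address` and its parameter names are hypothetical and are not taken from the 80960 documentation, and the wrap to 32 bits is an assumption based on the 32-bit address space.

```python
def effective_address(offset=0, base=0, index=0, scale=1):
    """Return base + index * scale + offset, wrapped to a 32-bit address.

    Illustrative model only: the parameter names are hypothetical.
    """
    # Table 2 allows scale factors of 1, 2, 4, 8 or 16 only.
    if scale not in (1, 2, 4, 8, 16):
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    return (base + index * scale + offset) & 0xFFFFFFFF


# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
print(hex(effective_address(offset=0x100, base=0x2000, index=3, scale=4)))

# Register-Indirect: the base register alone supplies the address
print(hex(effective_address(base=0x2000)))
```

Modes that omit a term simply leave the corresponding argument at its default value.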


Although programs may have procedure calls nest- 
ed many calls deep, a program typically oscillates 
back and forth between only two or three levels. As 
a result, with four stack frames in the cache, the 
probability of there being a free frame on the cache 
when a call is made is very high. In fact, runs of 
representative C-language programs show that 80% 
of the calls are handled without needing to access 
memory. 

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.

Note that the global registers are not exchanged on 
a procedure call, but retain their contents, making 
them available to all procedures for fast parameter 
passing. An illustration of the register cache is
shown in Figure 4. 
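
The behavior of the local register cache described above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of the actual hardware: the class name `LocalRegisterCache` and its methods are hypothetical, each frame is represented only by a procedure name, and the policy of reloading a spilled frame when the on-chip frames run out on return is an assumption.

```python
from collections import deque


class LocalRegisterCache:
    """Toy model of an on-chip cache holding up to four local register frames."""

    def __init__(self, frames=4):
        self.frames = frames
        self.on_chip = deque()   # oldest frame on the left
        self.in_memory = []      # frames spilled to the procedure stack
        self.spills = 0
        self.fills = 0

    def call(self, name):
        # With all frames in use, the oldest frame is moved to memory.
        if len(self.on_chip) == self.frames:
            self.in_memory.append(self.on_chip.popleft())
            self.spills += 1
        self.on_chip.append(name)

    def ret(self):
        self.on_chip.pop()
        # Assumed policy: reload the caller's frame only when none is on-chip.
        if not self.on_chip and self.in_memory:
            self.on_chip.append(self.in_memory.pop())
            self.fills += 1


cache = LocalRegisterCache()
cache.call("main")
for proc in ("a", "b", "c"):     # three nested calls fit on-chip
    cache.call(proc)
print(cache.spills)
cache.call("d")                  # a fifth active procedure forces a spill
print(cache.spills, cache.in_memory)
```

In this model the first three nested calls cause no memory traffic, which matches the text; only the call that needs a fifth frame spills the oldest one.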
Instruction Cache

To further reduce memory accesses, the 80960SA 
and 80960SB include a 512-byte on-chip instruction 
cache. The instruction cache is based on the con- 
cept of locality of reference; that is, most programs 
are not usually executed in a steady stream but con- 
sist of many branches and loops that lead to jumping 
back and forth within the same small section of 
code. Thus, by maintaining a block of instructions in
a cache, the number of memory references required 
to read instructions into the processor can be greatly 
reduced. 

To load the instruction cache, instructions are 
fetched in 16-byte blocks, so that up to four instructions
can be fetched at one time.

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure’s return. 
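The cache figures quoted above imply a few simple quantities, sketched below. The 4-byte instruction word follows from "up to four instructions" per 16-byte block; the helper names and the assumption that a loop starts on a block boundary are illustrative only, and the cache organization itself is not modeled.

```python
import math

CACHE_BYTES = 512   # on-chip instruction cache size (from the text)
BLOCK_BYTES = 16    # fetch block size (from the text)
WORD_BYTES = 4      # assumed size of one instruction word

CACHE_BLOCKS = CACHE_BYTES // BLOCK_BYTES   # blocks the cache can hold
CACHE_WORDS = CACHE_BYTES // WORD_BYTES     # instruction words it can hold


def block_fetches(loop_bytes):
    """Block fetches needed to load a block-aligned loop once."""
    return math.ceil(loop_bytes / BLOCK_BYTES)


def loop_fits(loop_bytes):
    """True if a block-aligned loop can reside entirely in the cache."""
    return loop_bytes <= CACHE_BYTES


print(CACHE_BLOCKS, CACHE_WORDS, block_fetches(100), loop_fits(100))
```

Under these assumptions a 100-byte loop is loaded with seven block fetches on its first pass and needs no further instruction fetches until the program leaves the loop.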


Register Scoreboarding 

The instruction decoder has been optimized in several
ways. One of these optimizations is the ability to
do instruction overlapping by means of register
scoreboarding.

Register scoreboarding occurs when a LOAD instruction
is executed to move a variable from memory
into a register. When the instruction is initiated, a
scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD address 1, R4
LOAD address 2, R5
Unrelated instruction
Unrelated instruction
ADD R4, R5, R6

In essence, the two unrelated instructions between 
the LOAD and ADD instructions are executed for 
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded. 
Up to three instructions can be pending at one time 
with three corresponding scoreboard bits set. By ex- 
ploiting this feature, system programmers and com- 
pilers have a useful tool for optimizing execution 
speed. 
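The effect of scoreboarding on the example sequence can be illustrated with a toy timing model. Everything numeric here is an assumption for illustration: one cycle per issued instruction, a fixed three-cycle load latency, and no limit on pending loads (the real part allows up to three). Bus contention between loads is ignored.

```python
def total_cycles(program, load_latency=3):
    """Cycles needed to issue every instruction in a toy scoreboard model.

    program is a list of (opcode, source_registers, destination) tuples.
    """
    ready_at = {}   # register -> cycle at which its scoreboard bit clears
    cycle = 0       # cycle at which the next instruction may issue
    for opcode, sources, destination in program:
        # An instruction stalls while any source register is still pending.
        start = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        if opcode == "LOAD":
            ready_at[destination] = start + load_latency
        cycle = start + 1
    return cycle


loads_then_add = [
    ("LOAD", (), "R4"),
    ("LOAD", (), "R5"),
    ("ADD", ("R4", "R5"), "R6"),
]
with_unrelated_work = [
    ("LOAD", (), "R4"),
    ("LOAD", (), "R5"),
    ("OTHER", (), None),
    ("OTHER", (), None),
    ("ADD", ("R4", "R5"), "R6"),
]
print(total_cycles(loads_then_add), total_cycles(with_unrelated_work))
```

With these assumed latencies both sequences finish in the same number of cycles, so the two unrelated instructions add no apparent execution time.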


Floating-Point Arithmetic 

In the 80960SB, floating-point arithmetic has been
made an integral part of the architecture. Having the
floating-point unit integrated on-chip provides two
advantages. First, it improves the performance of
the chip for floating-point applications, since no additional
bus overhead is associated with floating-point
calculations, thereby leaving more time for other
bus operations such as I/O. Second, the cost of
using floating-point operations is reduced because a
separate coprocessor chip is not required.

The 80960SB floating-point (real number) data types
include single-precision (32-bit), double-precision
(64-bit) and extended precision (80-bit) floating-point
numbers. Any register may be used to execute
floating-point operations.


The processor provides hardware support for both
mandatory and recommended portions of IEEE
Standard 754 for floating-point arithmetic, exponential,
logarithmic and other transcendental functions.
Table 3 shows execution times for some representative
instructions.

High Bandwidth Bus

The 80960SA and 80960SB CPUs reside on a high-bandwidth
address/data bus. The bus provides a direct
communication path between the processor
and the memory and I/O subsystem interfaces. The
processor uses the bus to fetch instructions, manipulate
memory and respond to interrupts. Its features
include:

• 16-bit data path multiplexed onto the lower bits of
the 32-bit address path

• Eight 16-bit half-word burst capacity, which allows
transfers from 1 to 16 bytes at a time

• High bandwidth reads and writes at 25.6 Mbytes
per second

Figure 5 identifies the groups of signals which constitute
the Bus. Table 4 lists the function of the Bus
and other processor-support signals, such as the interrupt
lines.


Interrupt Handling

The 80960SA and 80960SB can be interrupted in
one of two ways: by the activation of one of four
interrupt pins or by sending a message on the processor's
data bus.

The 80960SA and 80960SB are unusual in that they
automatically handle interrupts on a priority basis
and track pending interrupts through their on-chip
interrupt controller. Two of the interrupt pins can be
configured to provide 8259A handshaking for expansion
beyond four interrupt lines.



Figure 5. 80960SA and 80960SB Bus Signal Groups 


Table 3. Sample Floating-Point
Execution Times (µs) at 16 MHz

Function        32-Bit    64-Bit

Add               0.6       0.8
Subtract          0.6       0.8
Multiply          1.1       2.0
Divide            2.0       4.5
Square Root       5.8       6.1
Arctangent       15.8      20.5
Exponent         17.7      19.5
Sine             23.8      25.9
Cosine           23.8      25.9
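The execution times in Table 3 can be converted to approximate clock counts. This sketch assumes the 16 MHz in the table heading is the rate of the clock being counted; the dictionary and function names are hypothetical.

```python
CLOCK_MHZ = 16  # clock rate from the Table 3 heading

# (32-bit, 64-bit) execution times in microseconds, copied from Table 3.
TABLE3_US = {
    "Add": (0.6, 0.8),
    "Subtract": (0.6, 0.8),
    "Multiply": (1.1, 2.0),
    "Divide": (2.0, 4.5),
    "Square Root": (5.8, 6.1),
    "Arctangent": (15.8, 20.5),
    "Exponent": (17.7, 19.5),
    "Sine": (23.8, 25.9),
    "Cosine": (23.8, 25.9),
}


def cycles(time_us, clock_mhz=CLOCK_MHZ):
    """Convert a time in microseconds to clock cycles (MHz = cycles/us)."""
    return time_us * clock_mhz


for name, (t32, t64) in TABLE3_US.items():
    print(f"{name:12s} {cycles(t32):7.1f} {cycles(t64):7.1f}")
```

Because the published times are rounded to 0.1 µs, the resulting cycle counts are approximate rather than exact instruction timings.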
Debug Features

The 80960SA and 80960SB have built-in debug ca- 
pabilities. There are two types of breakpoints and six 
different trace modes. The debug features are con- 
trolled by two internal 32-bit registers, the Process- 
Controls Word and the Trace-Controls Word. By set- 
ting bits in these control words, a software debug 
monitor can closely control how the processor re- 
sponds during program execution. 


The 80960SA and 80960SB have both hardware
and software breakpoints. They provide two hardware
breakpoint registers on-chip which can be set
by a special command to any value. When the instruction
pointer matches the value in one of the
breakpoint registers, the breakpoint will fire, and a
breakpoint handling routine is called automatically.

Tracing is available for all instructions (single-step
execution), calls and returns and branching. Each
different type of trace may be enabled separately by
a special debug instruction. In each case, the
80960SA and 80960SB execute the instruction first
and then call a trace handling routine (usually part of
a software debug monitor). Further program execution
is halted until the trace routine is completed.
When the trace event handling routine is completed,
instruction execution resumes at the next instruction.
The 80960SA and 80960SB's tracing mechanisms,
which are implemented completely in hardware,
greatly simplify the task of testing and debugging
software.


BUILT-IN TESTABILITY

Upon reset, the 80960SA and 80960SB automatically
conduct an extensive internal test (self-test) of their
major blocks of logic. Then, before executing its first
instruction, the processor does a zero check sum on the first
eight words in memory to ensure that the system
has been loaded correctly. If a problem is discovered
at any point during the self-test, the 80960SA
and 80960SB will indicate a failure and will not begin
program execution. The self-test takes approximately
47,000 cycles to complete, and can be disabled.

System manufacturers can use the 80960SA and
80960SB's self-test feature during incoming parts inspection.
No special diagnostic programs need to be
written, and the test is both thorough and fast. The
self-test capability helps ensure that defective parts
will be discovered before systems are shipped, and
once in the field, the self-test makes it easier to distinguish
between problems caused by processor failure
and problems resulting from other causes.


FAULT DETECTION

The 80960SA and 80960SB have an automatic
mechanism to handle faults. There are ten fault
types including trace, arithmetic, and floating-point
faults. When the processor detects a fault, it automatically
calls the appropriate fault handling routine
and saves the current instruction pointer and necessary
state information to make efficient recovery
possible. The processor posts diagnostic information
on the type of fault to a Fault Record. Like interrupt
handling routines, fault handling routines are
usually written to meet the needs of a specific
application and are often included as part of the operating
system or kernel.

For each of the ten fault types, there are numerous
subtypes that provide specific information about a
fault. For example, a floating-point fault may have its
subtype set to an Overflow or Zero-Divide fault. The
fault handler can use this specific information to respond
correctly to the fault.


CHMOS 

The 80960SA and 80960SB are fabricated using In- 
tel’s CHMOS IV (Complementary High Speed Metal 
Oxide Semiconductor) process. This advanced tech- 
nology eliminates the frequency and reliability limita- 
tions of older CMOS processes and opens a new 
era in microprocessor performance. It combines the 
high performance capabilities of Intel’s industry- 
leading HMOS technology with the high density and 
low power characteristics of CMOS. The 80960SA 
and 80960SB are available at 10 MHz in both PLCC 
and QFP packages, and at 16 MHz in the PLCC 
package. 
Table 4. 80960SA and 80960SB Pin Description: Bus Signals

Symbol        Type       Name and Function

CLK2          I          SYSTEM CLOCK provides the fundamental timing for
                         80960SA and 80960SB systems. CLK2 is divided by two
                         inside the 80960SA and 80960SB to generate the
                         internal processor clock.

A31-A16       O, T.S.    ADDRESS BUS carries the upper 16 bits of the 32-bit
                         address to memory. It is valid throughout the burst
                         cycle; no latch is required.

AD15-AD1, D0  I/O, T.S.  ADDRESS/DATA BUS carries the low order 32-bit
                         addresses and 16-bit data to and from memory.
                         AD15-AD4 must be latched since the cycle following
                         the address cycle carries data on the bus.

A3-A1         O, T.S.    ADDRESS BUS carries the word addresses of the 32-bit
                         address to memory. These three bits are incremented
                         during a burst access indicating the next word
                         address of the burst access. Note that A3-A1 are
                         duplicated with AD3-AD1 during the address cycle.

ALE           O, T.S.    ADDRESS LATCH ENABLE indicates the transfer of a
                         physical address. ALE is asserted during a Ta cycle
                         and deasserted before the beginning of the following
                         Td state. It is active high and floats to a high
                         impedance state during a hold cycle (Th or Thr).

AS            O, T.S.    ADDRESS STATUS indicates an address state. AS is
                         asserted every Ta state and deasserted during the
                         following Td state. AS is driven HIGH during reset.

W/R           O, T.S.    WRITE/READ specifies, during a Ta cycle, whether the
                         operation is write or read. It is latched on-chip
                         and remains valid during Td cycles.

DEN           O, T.S.    DATA ENABLE is asserted during Td cycles and
                         indicates transfer of data on the AD lines. The AD
                         lines should not be driven by an external source
                         unless DEN is asserted. When DEN is asserted, the
                         outputs from the previous cycle are guaranteed to be
                         3-stated. In addition, DEN deasserted indicates
                         inputs have been captured and therefore input hold
                         times can be disregarded. DEN is driven HIGH during
                         reset.

READY         I          READY indicates that data on AD lines can be sampled
                         or removed. If READY is not asserted during a Td
                         cycle, the Td cycle is extended to the next cycle by
                         inserting a wait state (Tw).

DT/R          O, T.S.    DATA TRANSMIT/RECEIVE indicates the direction of the
                         data transfer to and from the bus. It is low during
                         Ta and Td cycles for a read or interrupt
                         acknowledgement; it is high during Ta and Td cycles
                         for a write. DT/R never changes state when DEN is
                         asserted. DT/R is driven HIGH during reset.

BLAST/FAIL    O, T.S.    BURST LAST indicates the last data cycle (Td) of a
                         burst access. It is asserted low during the last Td
                         and associated Tw cycles in a burst access.

                         INITIALIZATION FAILURE indicates that the processor
                         has failed to initialize correctly. The failure
                         state is indicated by a combination of BLAST
                         asserted and both BE signals not asserted. This
                         condition occurs after RESET is deasserted and
                         before the first bus transaction begins. FAIL is
                         asserted while the processor performs a self-test.
                         If the self-test completes successfully, then FAIL
                         is deasserted. Next, the processor performs a zero
                         checksum on the first eight words of memory. If it
                         fails, FAIL is asserted for a second time and
                         remains asserted; if it passes, system
                         initialization continues and FAIL remains
                         deasserted.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State.
Table 4. 80960SA and 80960SB Pin Description: Bus Signals (Continued)

Symbol        Type       Name and Function

RESET         I          RESET clears the internal logic of the processor and
                         causes it to reinitialize.

                         During RESET assertion, the input pins are ignored
                         (except for INT0, INT1, INT3, LOCK), the tri-state
                         output pins are placed in a HIGH impedance state
                         (except for DT/R, DEN, and AS), and other output
                         pins are placed in their non-asserted state.

                         RESET must be asserted for at least 41 CLK2 cycles
                         for a predictable reset. Optionally, for a
                         synchronous reset, the LOW to HIGH transition of
                         RESET should occur after the rising edge of both
                         CLK2 and the external bus clock, and before the next
                         rising edge of CLK2.

                         The interrupt pins indicate the initialization
                         sequence executed. Typical initialization requires
                         driving only INT0 and INT3 to a HIGH state. The
                         reset conditions follow:

                         INT0  INT1  INT3  LOCK  Action Taken
                          1     X     1     1    Run self-test (core
                                                 initialization)
                          0     0     1     1    Disable self-test
                          0     1     X     X    Reserved
                          X     X     0     X    Reserved
                          X     X     X     0    ONCE mode (see LOCK pin)

BE1-BE0       O, T.S.    BYTE ENABLE LINES specify which data bytes (up to
                         two) on the bus take part in the current bus cycle.
                         BE1 corresponds to AD15-AD8 and BE0 corresponds to
                         AD7-AD1, D0. The byte enable lines are asserted
                         appropriately during each data cycle.

                         INITIALIZATION FAILURE indicates that the processor
                         has failed to initialize correctly. The failure
                         state is indicated by a combination of BLAST
                         asserted and both BE signals not asserted. This
                         condition occurs after RESET is deasserted and
                         before the first bus transaction begins. FAIL is
                         asserted while the processor performs a self-test.
                         If the self-test completes successfully, then FAIL
                         is deasserted. Next, the processor performs a zero
                         checksum on the first eight words of memory. If it
                         fails, FAIL is asserted for a second time and
                         remains asserted; if it passes, system
                         initialization continues and FAIL remains
                         deasserted.

INT0          I          INTERRUPT 0 indicates a pending interrupt. The bus
                         interrupt control register determines in which way
                         the signal should be interpreted. To signal an
                         interrupt request in a synchronous system, this pin
                         (as well as the other interrupt pins) must be
                         enabled by being deasserted for at least one bus
                         cycle and then asserted for at least one additional
                         bus cycle; in an asynchronous system, the pin must
                         remain deasserted for at least two bus cycles and
                         then be asserted for at least two more bus cycles.
                         INT0 is sampled during RESET to determine if the
                         self-test sequence is to be executed.

INT1          I          INTERRUPT 1 indicates a direct interrupt, like INT0.
                         INT1 is sampled during RESET to determine if the
                         self-test sequence is to be executed.

INT2/INTR     I          INTERRUPT 2/INTERRUPT REQUEST: The interrupt control
                         register determines how this pin is interpreted. If
                         INT2, it has the same interpretation as the INT0 and
                         INT1 pins. If INTR, it is used to receive an
                         interrupt request from an external 8259A compatible
                         interrupt controller.

INT3/INTA     I/O, T.S.  INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The interrupt
                         control register determines how this pin is
                         interpreted. If INT3, it has the same interpretation
                         as the INT0 and INT1 pins. If INTA, it is used as an
                         output to control interrupt-acknowledge bus
                         transactions. The INTA output is latched on-chip and
                         remains valid during Td cycles. INT3 must be pulled
                         to a HIGH state during RESET.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State.
Table 4. 80960SA and 80960SB Pin Description: Bus Signals (Continued)

Symbol        Type       Name and Function

LOCK          I/O, O.D.  BUS LOCK prevents other bus masters from gaining
                         control of the bus following the current cycle (if
                         they would assert LOCK to do so). LOCK is used by
                         the processor or any bus agent when it performs
                         indivisible Read/Modify/Write (RMW) operations. Do
                         not leave LOCK unconnected. It must be pulled HIGH
                         for the processor to function properly.

                         For a read that is designated as an RMW-read, LOCK
                         is examined. If asserted, the processor waits until
                         it is not asserted; if not asserted, the processor
                         asserts LOCK during the Ta cycle and leaves it
                         asserted.

                         A write that is designated as an RMW-write deasserts
                         LOCK in the Ta cycle. During the time LOCK is
                         asserted, a bus agent can perform a normal read or
                         write but no RMW operations. LOCK is also held
                         asserted during an interrupt-acknowledge
                         transaction.

                         ONCE MODE: The LOCK pin is sampled during reset. If
                         it is asserted LOW at the end of RESET, all outputs
                         will be 3-stated until the part is reset. ONCE MODE
                         is used in conjunction with an ICE.

HOLD          I          HOLD: HOLD indicates a request from a secondary bus
                         master to acquire the bus. When the processor
                         receives HOLD and grants another master control of
                         the bus, it floats its tri-state bus lines and then
                         asserts HLDA and enters the Th state. When HOLD is
                         deasserted, the processor will deassert HLDA and go
                         to either the Ti or Ta state.

HLDA          O, T.S.    HOLD ACKNOWLEDGE: HLDA indicates that bus control
                         has been relinquished to another bus master. This
                         signal is always driven. At RESET it is driven LOW.

N.C.          N/A        NOT CONNECTED indicates pins should not be
                         connected. Never connect any pin marked N.C.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State.


ELECTRICAL SPECIFICATIONS 
Power and Grounding 

The 80960SA and 80960SB are implemented in
CHMOS IV technology and have modest power requirements.
Their high clock frequency and numerous
output buffers (address/data, control, error, and
arbitration signals) can cause power surges as multiple
output buffers drive new signal levels simultaneously.
For clean on-chip power distribution at high
frequency, 12 Vcc and 13 Vss pins separately feed
functional units of the 80960SA and 80960SB in the
package.

Power and ground connections must be made to all
power and ground pins of the 80960SA and
80960SB. On the circuit board, all Vcc pins must be
strapped closely together, preferably on a power
plane. Likewise, all Vss pins should be strapped together,
preferably on a ground plane. These pins
may not be connected together within the chip.


Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960SA and 80960SB. The processor can 
cause transient power surges when driving the bus, 
particularly when it is connected to a large capaci- 
tive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and decou- 
pling capacitors as much as possible. 
Connection Recommendations

For reliable operation, always connect unused inputs
to an appropriate signal level. In particular, if
one or more interrupt lines are not used, they should
be deasserted. No inputs should ever be left floating.

All open-drain outputs require a pullup device. While
in some cases a simple pullup resistor will be adequate,
we recommend a network of pullup and pulldown
resistors biased to a valid VIH (≥2.0V) and
terminated in the characteristic impedance of the circuit
board. Figure 6 shows our recommendations for
the resistor values for both a low and high current
drive network, which assumes that the circuit board
has a characteristic impedance of 100Ω. The advantage
of terminating the output signals in this fashion
is that it limits signal swing and reduces AC power
consumption.


Characteristic Curves 

The 80960SA and 80960SB characteristic curves
shown in Figures 7 through 10 supply information
regarding typical supply currents, typical current versus
frequency, worst case voltage versus output current
on open drain pins and capacitive derating
curves.

Figure 7 shows the typical supply current requirements
over the operating temperature range of the
processor at supply voltage (Vcc) of 5V. Figure 8
shows the typical power supply current (Icc) required
by the 80960SA and 80960SB at various operating
frequencies when measured at three input
voltage (Vcc) levels.

For a given output current (IOL), the curve in Figure 9
shows the worst case output low voltage (VOL). Figure 10
shows the typical capacitive derating curve
for the 80960SA and 80960SB measured from 1.5V
on the system clock (CLK) to 0.8V on the falling
edge and 2.0V on the rising edge of the bus address/data
(AD) signals.


Test Load Circuit 


Figure 11 illustrates the load circuit used to test the
80960SA and 80960SB's 3-state pins, and Figure 12
shows the load circuit used to test the open drain
output. The open drain test uses an active load circuit
in the form of a matched diode bridge. Since the
open-drain output sinks current, only the IOL legs of
the bridge are necessary and the IOH legs are not
used. When the 80960SA and 80960SB driver under
test is turned off, the output pin is pulled up to VREF
(i.e., VOH). Diode D1 is turned off and the IOL current
source flows through diode D2.

When the 80960SA and 80960SB open-drain driver
under test is on, diode D1 is also on, and the voltage
on the pin being tested drops to VOL. Diode D2 turns
off and IOL flows through diode D1.



Low and High Current Drive Networks for the LOCK Pin 
Figure 7. Typical Supply Current vs Supply Voltage
(horizontal axis: Supply Voltage (V))

Figure 8. Typical Current vs Frequency
(horizontal axis: Operating Frequency (MHz); curves at 4.5V, 5.0V and 5.5V)

Output Current on Open-Drain Pin

Figure 11. Test Load Circuit for 3-State Output Pins

Open-Drain Output Pins


ABSOLUTE MAXIMUM RATINGS

Operating Temperature (PLCC) . . . . 0°C to +100°C Case
Operating Temperature (QFP) . . . . . 0°C to +100°C Case
Storage Temperature . . . . . . . . . -65°C to +150°C
Voltage on Any Pin (PLCC) . . . . . . -0.5V to Vcc + 0.5V
Voltage on Any Pin (QFP) . . . . . . -0.25V to Vcc + 0.25V
Power Dissipation . . . . . . . . . . 1.9W (16 MHz)


NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 

* WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings” may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and ex- 
tended exposure beyond the “Operating Conditions” 
may affect device reliability. 


DC CHARACTERISTICS

960SA/SB (10 MHz and 16 MHz): Tcase = 0°C to +100°C, Vcc = 5V ±10% unless otherwise noted.

Symbol  Parameter                   Min      Max        Units  Conditions

VIL     Input Low Voltage           -0.3     +0.8       V
VIH     Input High Voltage          2.0      Vcc + 0.3  V
VCL     CLK2 Input Low Voltage      -0.3     +0.8       V
VCH     CLK2 Input High Voltage     0.7 Vcc  Vcc + 0.3  V
VOL     Output Low Voltage                   0.45       V      IOL = 2.5 mA
                                             0.45       V      IOL = 12 mA, LOCK Pin
                                             0.60       V      IOL = 20 mA, LOCK Pin
VOH     Output High Voltage         2.4                 V      All T.S., -2.5 mA (4)
ICC     Power Supply Current:
          10 MHz - QFP                       280        mA     Tcase = 0°C
          10 MHz - PLCC                      280        mA     Tcase = 0°C
          16 MHz - PLCC                      350        mA     Tcase = 0°C
ILO     Output Leakage Current               +15        µA     (Note 5)
ILI     Input Leakage Current                +15        µA     0 ≤ VO ≤ Vcc (2)
CIN     Input Capacitance                    10         pF     fc = 1 MHz (3)
CO      I/O or Output Capacitance            12         pF     fc = 1 MHz (3)
CCLK    Clock Capacitance                    10         pF     fc = 1 MHz (3)

NOTES:

1. Tcase is specified at 0°C to +100°C for the QFP at 10 MHz and Vcc = 5V ±5%.

2. INT0 has an internal pullup that sources 100 µA.

3. Input, output and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. LOCK has an internal pullup that sources 100 µA.
AC SPECIFICATIONS

This section describes the AC specifications for the
80960SA and 80960SB pins. All input and output
timings are specified relative to the 1.5V level of the
rising edge of CLK2, and refer to the time at which
the signal crosses (for output delay and input setup)
1.5V. All AC testing should be done with input voltages
of 0.4V and 2.4V, except for the clock (CLK2),
which should be tested with input voltages of 0.45V
and 0.7 * Vcc. See Figure 13 for timing relationships
for the 80960SA and 80960SB signals.


CLK2

OUTPUTS: AD(1:15), A(1:3), D0, A(16:31), BE(0:1), DEN, BLAST,
HLDA, LOCK, INTA, ALE, DT/R, AS

INPUTS: AD(1:15), D0, INT0, INT1, INT2/INTR, INT3, HOLD, LOCK, READY

Figure 13. Drive Levels and Timing Relationships of 80960SA and 80960SB Signals
AC Specification Tables

80960SA and 80960SB AC Characteristics (10 MHz)

Symbol  Parameter                          Min    Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)      50     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)    8            ns     VT = 10% Point
                                                               = VCL + (VCH - VCL) x 0.1
T3      Processor Clock High Time (CLK2)   8            ns     VT = 90% Point
                                                               = VCL + (VCH - VCL) x 0.9
T4      Processor Clock Fall Time (CLK2)          10    ns     VT = 90% Point to 10% Point (3)
T5      Processor Clock Rise Time (CLK2)          10    ns     VT = 10% Point to 90% Point (3)
T6      Output Valid Delay                 2      31    ns     CL = 100 pF (AD and Control)
T6AS    AS Output Valid Delay              2      25    ns     CL = 50 pF
T7      ALE Width                          24           ns     CL = 100 pF
T8      ALE Output Valid Delay             4            ns     CL = 100 pF (1)
T9      Output Float Delay                 2      20    ns     CL = 100 pF (AD)
                                                               CL = 100 pF (Controls) (1)
T10     Input Setup 1                      10           ns
T11     Input Hold                         2            ns     (Note 4)
T12     Input Setup 2                      13           ns
T13     Setup to ALE Inactive              10           ns     CL = 100 pF
T14     Hold after ALE Inactive            8            ns     CL = 100 pF
T15     RESET Hold                         3            ns     (Note 2)
T16     RESET Setup                        5            ns     (Note 2)
T17     RESET Width                        2050         ns     41 CLK2 Periods Minimum

NOTES:

1. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should
be no longer than the valid delay.

2. Meeting RESET setup and hold times is an optional method of synchronizing your clocks. If you decide to use an asynchronous
reset, then synchronizing the clock can be accomplished by using AS.

3. Processor clock (CLK2) rise time and fall time are not tested.

4. ICE requires a minimum of 4 ns input hold time.
80960SA and 80960SB AC Characteristics (16 MHz PLCC)

Symbol  Parameter                          Min    Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)      31.25  125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)    8            ns     VT = 10% Point
                                                               = VCL + (VCH - VCL) x 0.1
T3      Processor Clock High Time (CLK2)   8            ns     VT = 90% Point
                                                               = VCL + (VCH - VCL) x 0.9
T4      Processor Clock Fall Time (CLK2)          10    ns     VT = 90% Point to 10% Point (3)
T5      Processor Clock Rise Time (CLK2)          10    ns     VT = 10% Point to 90% Point (3)
T6      Output Valid Delay                 2      25    ns     CL = 100 pF (AD and Control)
T6AS    AS Output Valid Delay              2      21    ns     CL = 50 pF
T7      ALE Width                          15           ns     CL = 100 pF
T8      ALE Output Valid Delay             2      22    ns     CL = 100 pF (1)
T9      Output Float Delay                 2      20    ns     CL = 100 pF (AD)
                                                               CL = 100 pF (Controls) (1)
T10     Input Setup 1                      10           ns
T11     Input Hold                         2            ns     (Note 4)
T12     Input Setup 2                      13           ns
T13     Setup to ALE Inactive              10           ns     CL = 100 pF
T14     Hold after ALE Inactive            8            ns     CL = 100 pF
T15     RESET Hold                         3            ns     (Note 2)
T16     RESET Setup                        5            ns     (Note 2)
T17     RESET Width                        1281         ns     41 CLK2 Periods Minimum

NOTES:

1. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should
be no longer than the valid delay.

2. Meeting RESET setup and hold times is an optional method of synchronizing your clocks. If you decide to use an asynchronous
reset, then synchronizing the clock can be accomplished by using AS.

3. Processor clock (CLK2) rise time and fall time are not tested.

4. ICE requires a minimum of 4 ns input hold time.
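The T17 entries in both AC tables follow from the 41-period minimum and the T1 clock period. The sketch below reproduces that arithmetic, along with the internal clock rate implied by CLK2 being divided by two; the function names are chosen here for illustration.

```python
RESET_MIN_CLK2_PERIODS = 41  # "41 CLK2 Periods Minimum" from the AC tables


def reset_width_ns(clk2_period_ns):
    """Minimum RESET width for a given CLK2 period (T1), in nanoseconds."""
    return RESET_MIN_CLK2_PERIODS * clk2_period_ns


def processor_clock_mhz(clk2_period_ns):
    """Internal processor clock: CLK2 frequency divided by two."""
    clk2_mhz = 1000.0 / clk2_period_ns
    return clk2_mhz / 2


for t1 in (50, 31.25):  # minimum T1 for the 10 MHz and 16 MHz parts
    print(t1, processor_clock_mhz(t1), reset_width_ns(t1))
```

The 16 MHz result is 1281.25 ns, which the table lists as 1281 ns.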
Signals shown: CLK, CLK2, A(4:15)/D(0:15), A(1:3), BE(0:1), A(16:31),
ALE, BLAST, W/R, DT/R, DEN, READY

NOTES:

1. The AD and control signals are driven at all times except during a HOLD acknowledge (HLDA asserted), RESET, and
ONCE mode.

2. The AD and control signals may toggle during idle (Ti) or recovery (Tr) cycles.


Figure 14. Timing Relationships of the 80960SA and 80960SB Bus
1. The A edge is defined as the first rising CLK2 edge after RESET is deasserted meeting the RESET hold and setup
times.

2. Initialization Parameters must be set up at least four CLK2s prior to the first A edge.


Figure 15. RESET Signal Timing



Design Considerations 

Input hold times can be disregarded by the designer
whenever the input is removed because a subsequent
output from the processor is deasserted (e.g.,
DEN becomes deasserted).

Whenever the processor generates an output that
indicates a transition into a subsequent state, any
outputs that are specified to be 3-stated in this new
state are guaranteed to be 3-stated. For example, in
the Td cycle following a Ta cycle for a read, the
minimum output delay of DEN is 2 ns, but the maximum
float time of AD is 20 ns. When DEN is asserted,
however, the AD outputs are guaranteed to have
been 3-stated.


Designing for the ICE-960SB

The 80960SA and 80960SB In-Circuit Emulator assists
in debugging 80960SA and 80960SB hardware
and software designs. The product consists of a
probe module, cable, control unit and power supply.
Because of the high operating frequency of the
80960SA and 80960SB systems, the probe module
connects directly to the 80960SA and 80960SB
component (EIAJ QFP or PLCC) or a socket for the
PLCC.

When designing an 80960SA and 80960SB hard- 
ware system that uses the ICE-960SB to debug the 
system, several electrical and mechanical character- 
istics should be considered. These considerations 
include capacitive loading, drive requirement, power 
requirement, and physical layout. 

The ICE-960SB probe module increases the load
capacitance of each line by up to 25 pF. This load
originates from the probe module and is driven by
the 80960SA and 80960SB processor.

To achieve high noise immunity, the ICE-960SB
probe is powered by the user's system. The high-speed
probe circuitry draws up to 1.1A plus the maximum
current (Icc) of the 80960SA and 80960SB
processor.

The AD bus should not be driven by an external
source unless DEN is asserted. In addition, the ICE
requires a minimum data hold time of 4 ns.

The ICE-960SB probe will drive LOCK to a LOW
state during RESET to force the target 80960SA and
80960SB to enter ONCE mode. To guarantee timings,
the ICE requires ±5% supply voltage supplied
to the 80960SA and 80960SB. The ICE probe requires
a minimum of 0.25 inches clearance on all
sides of both the EIAJ QFP and PLCC.


Lock Line Termination 

You must terminate the LOCK line as described in 
Figure 6 in order for the ICE to properly function. 

MECHANICAL DATA 

Package Dimensions and Mounting 

The 80960SA and 80960SB is available in two differ- 
ent packages: an 80-lead quad flat pack (EIAJ QFP), 
shown In Figure 17, and an 84-lead plastic leaded 
chip carrier (PLCC), shown in Figure 1 8. 

Pin Assignment 

The QFP and PLCC have different pin assignments. 
The QFP pins are numbered in order from 1 to 80 
around the package’s perimeter. The PLCC pins are 
numbered in order from 1 to 84 around the pack- 
age’s perimeter. Tables 9 and 10 list the function of 
each pin in the QFP. Tables 11 and 12 list the func- 
tion of each pin in the PLCC. 

Vcc and GND connections must be made to multiple 
Vcc and GND pins. Each Vcc and GND pin must be 
connected to the appropriate voltage or ground and 
externally strapped close to the package. We rec- 
ommend that you include separate power and 
ground planes in your circuit board for power distri- 
bution. 

NOTE: 

Pins identified as N.C., “No Connect,” should never 
be connected. The 80960SA and 80960SB QFP 
package contains two N.C. pins and the PLCC package 
contains six N.C. pins. 

Package Thermal Specification 

The 80960SA and 80960SB are specified for opera- 
tion when case temperature is within the range 0°C 
to +85°C. The case temperature should be mea- 
sured at the top center of the package. 

The ambient temperature can be calculated from θJC 
and θJA by using the following equations: 

Tj = Tc + P * θJC 
Ta = Tj - P * θJA 
Tc = Ta + P * [θJA - θJC] 

Values for θJA and θJC are given in Table 7 for the 
QFP package and in Table 8 for the PLCC package 
for various airflows. 

Example: 

Ta = Tc - P * (θJA - θJC) 

Tc = Maximum Case Temperature 

P = Maximum Supply Voltage times ICC 
at 100° and 10 MHz 

θJA and θJC = QFP Package Thermal Resistance 
at 0 ft/min airflow 

Ta = 51 = 100 - (5.5 * 0.213) * (45.7 - 4) 
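
The worked example above can be checked numerically. The following is a minimal sketch in Python; the function and variable names are illustrative only and do not come from the data sheet. It applies the same relationships (Ta = Tc - P * (θJA - θJC) and Tj = Tc + P * θJC) to the values used in the example.

```python
def ambient_temperature(t_case_c, power_w, theta_ja, theta_jc):
    """Ambient temperature from Ta = Tc - P * (theta_JA - theta_JC)."""
    return t_case_c - power_w * (theta_ja - theta_jc)


def junction_temperature(t_case_c, power_w, theta_jc):
    """Junction temperature from Tj = Tc + P * theta_JC."""
    return t_case_c + power_w * theta_jc


# Values taken from the example: 5.5 V supply, 0.213 A supply current,
# theta_JA = 45.7 and theta_JC = 4 (degrees C per watt), case at 100 degrees C.
power = 5.5 * 0.213
ta = ambient_temperature(100.0, power, theta_ja=45.7, theta_jc=4.0)
print(round(ta))  # 51, matching the example
```

The same two functions can be reused with the PLCC values from Table 8 by substituting the θJA and θJC entries for the airflow of interest.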

WAVEFORMS 

Figures 19 through 22 show the waveforms for vari- 
ous signals on the 80960SA and 80960SB bus. 

Table 7. QFP Package, Thermal Resistance (°C/Watt) 



NOTES: 

1. This table applies to an 80960SA and 80960SB QFP soldered directly onto a board. 

2. θJA = θJC + θCA. 

3. Thermal data are based on copper lead frames. 

Table 8. PLCC Package, Thermal Resistance (°C/Watt) 

                                       Airflow (ft/min) 
Parameter                  0     50    100   200   400   600   800   1000 
θJA Junction to Ambient    33    na    na    27    23.8  22    20    19.5 
(No Heatsink) 
θJC Junction to Case       13    na    na    na    na    na    na    na 

NOTES: 

1. This table applies to an 80960SA and 80960SB PLCC soldered directly onto a board. 

2. θJA = θJC + θCA. 



Figure 17. 80-Lead EIAJ Quad Flat Pack Package 

Figure 18. 84-Lead Plastic Leaded Chip Carrier 

Table 9. 80960SA and 80960SB QFP Pinout - In Pin Order 

Pin  Signal      Pin  Signal      Pin  Signal      Pin  Signal 
1    A22         21   Vcc         41   BE0         61   Vcc 
2    A21         22   Vss         42   Vcc         62   Vss 
3    A20         23   Vcc         43   Vss         63   N.C. 
4    A19         24   Vss         44   CLK2        64   AS 
5    A18         25   AD6         45   RESET       65   Vss 
6    A17         26   AD5         46   INT0        66   ALE 
7    A16         27   AD4         47   INT1        67   READY 
8    Vcc         28   AD3         48   INT2/INTR   68   A31 
9    Vss         29   AD2         49   INT3/INTA   69   A30 
10   AD15        30   AD1         50   HLDA        70   A29 
11   AD14        31   D0          51   Vcc         71   A28 
12   Vcc         32   Vss         52   Vss         72   Vss 
13   Vss         33   Vcc         53   HOLD        73   Vcc 
14   AD13        34   A3          54   W/R         74   A27 
15   AD12        35   A2          55   DEN         75   A26 
16   AD11        36   Vcc         56   DT/R        76   A25 
17   AD10        37   Vss         57   BLAST       77   Vcc 
18   AD9         38   A1          58   LOCK        78   Vss 
19   AD8         39   N.C.        59   Vcc         79   A24 
20   AD7         40   BE1         60   Vss         80   A23 

Table 10. 80960SA and 80960SB QFP Pinout - In Signal Order 

Signal  Pin   Signal  Pin   Signal      Pin   Signal  Pin 
A1      38    A18     5     D0          31    Vcc     51 
A2      35    A19     4     DEN         55    Vcc     59 
A3      34    A20     3     DT/R        56    Vcc     61 
AD1     30    A21     2     HLDA        50    Vcc     73 
AD2     29    A22     1     HOLD        53    Vcc     77 
AD3     28    A23     80    INT0        46    Vcc     8 
AD4     27    A24     79    INT1        47    Vss     13 
AD5     26    A25     76    INT2/INTR   48    Vss     22 
AD6     25    A26     75    INT3/INTA   49    Vss     24 
AD7     20    A27     74    LOCK        58    Vss     32 
AD8     19    A28     71    N.C.        39    Vss     37 
AD9     18    A29     70    N.C.        63    Vss     43 
AD10    17    A30     69    READY       67    Vss     52 
AD11    16    A31     68    RESET       45    Vss     60 
AD12    15    ALE     66    Vcc         12    Vss     62 
AD13    14    AS      64    Vcc         21    Vss     72 
AD14    11    BE0     41    Vcc         23    Vss     78 
AD15    10    BE1     40    Vcc         33    Vss     9 
A16     7     BLAST   57    Vcc         36    Vss     65 
A17     6     CLK2    44    Vcc         42    W/R     54 

Table 11. 80960SA and 80960SB PLCC Pinout - In Pin Order 

Pin  Signal      Pin  Signal      Pin  Signal      Pin  Signal 
1    Vcc         22   Vss         43   Vss         64   HOLD 
2    N.C.        23   N.C.        44   Vcc         65   N.C. 
3    A27         24   AD13        45   A3          66   W/R 
4    A26         25   AD12        46   A2          67   DEN 
5    A25         26   AD11        47   Vcc         68   DT/R 
6    Vcc         27   AD10        48   Vss         69   BLAST 
7    Vss         28   AD9         49   A1          70   LOCK 
8    A24         29   AD8         50   N.C.        71   Vcc 
9    A23         30   AD7         51   BE1         72   Vss 
10   A22         31   Vcc         52   BE0         73   Vcc 
11   A21         32   Vss         53   Vcc         74   Vss 
12   A20         33   Vcc         54   Vss         75   N.C. 
13   A19         34   Vss         55   CLK2        76   AS 
14   A18         35   AD6         56   RESET       77   Vss 
15   A17         36   AD5         57   INT0        78   ALE 
16   A16         37   AD4         58   INT1        79   READY 
17   Vcc         38   AD3         59   INT2/INTR   80   A31 
18   Vss         39   AD2         60   INT3/INTA   81   A30 
19   AD15        40   AD1         61   HLDA        82   A29 
20   AD14        41   D0          62   Vcc         83   A28 
21   Vcc         42   N.C.        63   Vss         84   Vss 

Table 12. 80960SA and 80960SB PLCC Pinout - In Signal Order 

Signal  Pin   Signal  Pin   Signal      Pin   Signal  Pin 
A1      49    A18     14    DT/R        68    Vcc     44 
A2      46    A19     13    HLDA        61    Vcc     47 
A3      45    A20     12    HOLD        64    Vcc     53 
D0      41    A21     11    INT0        57    Vcc     6 
AD1     40    A22     10    INT1        58    Vcc     62 
AD2     39    A23     9     INT2/INTR   59    Vcc     71 
AD3     38    A24     8     INT3/INTA   60    Vcc     73 
AD4     37    A25     5     LOCK        70    Vss     18 
AD5     36    A26     4     N.C.        2     Vss     22 
AD6     35    A27     3     N.C.        23    Vss     32 
AD7     30    A28     83    N.C.        42    Vss     34 
AD8     29    A29     82    N.C.        50    Vss     43 
AD9     28    A30     81    N.C.        65    Vss     48 
AD10    27    A31     80    N.C.        75    Vss     54 
AD11    26    ALE     78    READY       79    Vss     63 
AD12    25    AS      76    RESET       56    Vss     7 
AD13    24    BE0     52    Vcc         1     Vss     72 
AD14    20    BE1     51    Vcc         17    Vss     74 
AD15    19    BLAST   69    Vcc         21    Vss     77 
A16     16    CLK2    55    Vcc         31    Vss     84 
A17     15    DEN     67    Vcc         33    W/R     66 

Figure 20. 80960SA and 80960SB Timing 
Showing a Four Word Aligned Read Burst 

Figure 21. 80960SA and 80960SB Double Word Read Timing with Wait States 

Figure 22. 80960SA and 80960SB Aligned Double Word Write Timing with Wait States 

i960™ KA/KB PROCESSOR 
PRODUCT OVERVIEW 


INTRODUCTION 

This chapter provides an overview of the Intel i960 KB 
processor (which is part of the i960 K series of embed- 
ded-processor products). 

All of the processors in the i960 K series of products 
are based on the Intel i960™ architecture. Most of the 
information in this overview also applies to the i960 
KA processor. The only difference between the i960 
KB and i960 KA processors is that the i960 KA proc- 
essor does not provide on-chip support for floating- 
point operations or operations on decimal numbers. 


OVERVIEW OF THE i960™ KB 
ARCHITECTURE 

The i960 KB processor introduced the i960 architec- 
ture — a new 32-bit architecture from Intel. This archi- 
tecture has been designed to meet the needs of embed- 
ded applications such as machine control, robotics, 
process control, avionics and instrumentation. 

The i960 architecture can best be characterized as a 
high-performance computing engine. It features high- 
speed instruction execution and ease of programming. 
It is also easily extensible, allowing processors and con- 
trollers based on this architecture to be conveniently 
customized to meet the needs of specific processing and 
control applications. 

The following are some of the important attributes of 
the i960 architecture: 

• full 32-bit registers 

• high-speed, pipelined instruction execution 

• a convenient program execution environment with 
32 general-purpose registers and a versatile set of 
special-function registers 

• a highly optimized procedure call mechanism that 
features on-chip caching of local variables and pa- 
rameters 

• extensive facilities for handling interrupts and faults 

• extensive tracing facilities to support efficient pro- 
gram debugging and monitoring 

• register scoreboarding and write buffering to permit 
efficient operation when used with lower perform- 
ance memory subsystems 


OVERVIEW OF THE SINGLE 
PROCESSOR SYSTEM 
ARCHITECTURE 

The central processing module, memory module and 
I/O module form the natural boundaries for the hard- 
ware system architecture. The modules are connected 
together by the high bandwidth 32-bit multiplexed 
L-bus, which can transfer data at a maximum sustained 
rate of 53 Mbytes per second for an i960 processor op- 
erating at 20 MHz. 
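
The 53 Mbytes per second figure can be reproduced with a short calculation. The sketch below is in Python and rests on an assumption that this overview does not state: that a four-word burst occupies six clocks in total (the four data clocks plus two overhead clocks, taken here to be one address cycle and one recovery cycle). The function name and parameters are illustrative only.

```python
BYTES_PER_WORD = 4


def burst_bandwidth_mbytes(clock_mhz, words_per_burst=4, overhead_clocks=2):
    """Sustained transfer rate in Mbytes/s for back-to-back bursts.

    overhead_clocks is an assumed value (address plus recovery cycle),
    chosen because it reproduces the figure quoted in the text.
    """
    clocks_per_burst = words_per_burst + overhead_clocks
    bytes_per_burst = words_per_burst * BYTES_PER_WORD
    return clock_mhz * bytes_per_burst / clocks_per_burst


print(round(burst_bandwidth_mbytes(20), 1))  # 53.3
```

Under the same assumption, a 25 MHz clock gives about 66.7 Mbytes per second.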

Figure 1 shows a simplified block diagram of one possi- 
ble system configuration. The heart of this system is the 
i960 KB processor, which fetches instructions, executes 
code, manipulates stored information and interacts 
with I/O devices. The high bandwidth L-bus connects 
the i960 KB processor to memory and I/O modules. 
The i960 KB processor stores system data, instructions 
and programs in the memory module. By accessing var- 
ious peripheral devices in the I/O module, the i960 KB 
processor supports communication to terminals, mo- 
dems, printers, disks and other I/O devices. 


i960™ KB Processor and the L-Bus 

The i960 KB processor performs bus operations using 
multiplexed address and data signals, and provides all 
the necessary control signals. For example, standard 
control signals, such as Address Latch Enable (ALE), 
Address/Data Status (ADS), Write/Read Command 
(W/R), Data Transmit/Receive (DT/R) and Data En- 
able (DEN), are provided by the i960 KB processor. 
The i960 processor also generates byte enable signals 
that specify which bytes on the 32-bit data lines are 
valid for the transfer. 

The L-bus supports burst transactions, which access up 
to four data words at a maximum rate of one word per 
clock cycle. The i960 KB processor uses the two low- 
order address lines to indicate how many words are to 
be transferred. The i960 KB processor performs burst 
transactions to load the on-chip 512-byte instruction 
cache to minimize memory accesses for instruction 
fetches. Burst transactions can also be used for data 
access. 

To transfer control of the bus to an external bus master, 
the i960 KB provides two arbitration signals: hold re- 
quest (HOLD) and hold acknowledge (HLDA). After 
receiving HOLD, the processor grants control of the 
bus to an external master by asserting HLDA. 

Figure 1. Basic i960™ KB System Configuration 


The i960 KB processor provides a flexible interrupt 
structure by using an on-chip interrupt controller, an 
external interrupt controller or both. The type of inter- 
rupt structure is specified by an internal interrupt vec- 
tor register. For a system with multiple processors, 
another method is available, called inter-agent commu- 
nication (IAC), where a processor can interrupt another 
processor by sending an IAC message. 

Memory Module 

A memory module can consist of a memory controller, 
Erasable Programmable Read Only Memory 
(EPROM), and static or dynamic Random Access 
Memory (RAM). The memory controller first condi- 
tions the L-bus signals for memory operation. It demul- 
tiplexes the address and data lines, generates the chip 
select signals from the address, detects the start of the 
cycle for burst mode operation and latches the byte 
enable signals. 

The memory controller generates the control signals for 
EPROM, SRAM and DRAM. Specifically, it provides 
the control signals, multiplexed row/column address 
and refresh control for dynamic RAMs. The controller 
can be designed to accommodate the burst transaction 
of the i960 KB processor by using the static column 
mode or nibble mode features of the dynamic RAM. In 
addition to supplying the operational signals, the con- 
troller generates the READY signal to indicate that 
data can be transferred to or from the i960 KB proces- 
sor. 

The i960 KB processor directly addresses up to 
4 Gbytes of physical memory. The processor does not 
allow burst accesses to cross a 16-byte boundary, to 
ease the design of the controller. Each address specifies 
a four-byte data word within the block. Individual data 
bytes can be accessed by using the four byte-enable sig- 
nals from the i960 KB processor. Chapter 5 provides 
design guidelines for the memory controller. 
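
The 16-byte boundary rule can be illustrated from the memory controller designer's point of view. The following Python sketch is hypothetical: the function name, the word-alignment check and the way a longer access is divided are assumptions made for illustration, since this overview only states that a burst never crosses a 16-byte boundary.

```python
WORD_BYTES = 4
BLOCK_BYTES = 16


def split_bursts(address, nbytes):
    """Split a word-aligned access into bursts that each stay inside
    one 16-byte block. Returns a list of (start_address, word_count)."""
    if address % WORD_BYTES or nbytes % WORD_BYTES:
        raise ValueError("sketch handles word-aligned accesses only")
    bursts = []
    while nbytes > 0:
        room = BLOCK_BYTES - (address % BLOCK_BYTES)  # bytes left in block
        chunk = min(nbytes, room)
        bursts.append((address, chunk // WORD_BYTES))
        address += chunk
        nbytes -= chunk
    return bursts


# A 16-byte access starting on a block boundary fits in one burst.
print(split_bursts(0x2000, 16))
# The same access starting 8 bytes into a block needs two bursts.
print(split_bursts(0x1008, 16))
```

Because no burst spans two blocks, a controller built around this rule only ever has to increment the low-order word address within a block.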

I/O Module 

The I/O module consists of the I/O components and 
the interface circuit. I/O components can be used to 
allow the i960 KB processor to use most of its clock 
cycles for computational and system management ac- 
tivities. Time-consuming tasks can be off-loaded to spe- 
cialized slave-type components, such as the 8259A Pro- 
grammable Interrupt Controller or the 82530 Serial 
Communication Controller. Some tasks may require a 
master-type component, such as the 82586 Local Area 
Network Control. 

The interface circuit performs several functions. It de- 
multiplexes the address and data lines, generates the 
chip select signals from the address, produces the I/O 
read or I/O write command from the processor’s W/R 
signal, latches the byte enable signals and generates the 
READY signals. Since some of these functions are 
identical to those of the memory controller, the same 
logic can be used for both interfaces. For master-type 
peripherals that operate on a 16-bit data bus, the inter- 
face circuit translates the 32-bit data bus to a 16-bit 
data bus. 

The i960 KB processor uses memory-mapped addresses 
to access I/O devices. This allows the CPU to use many 
of the same instructions to exchange information for 
both memory and peripheral devices. Thus, the power- 
ful memory-type instructions can be used to perform 8-, 
16- and 32-bit data transfers. 


HIGH PERFORMANCE PROGRAM 
EXECUTION 

Much of the design of the i960 architecture has been 
aimed at maximizing the processor’s computational 
and data processing speed through the use of increased 
parallelism. The following paragraphs describe several 
of the mechanisms and techniques used to accomplish 
this goal. 


Load and Store Model 

One of the more important features of the i960 archi- 
tecture is its performance of most operations on oper- 
ands in registers, rather than in memory. For example, 
all arithmetic, logic, comparison, branching and bit op- 
erations are performed with registers and literals. 

This feature provides two benefits. First, it increases 
program execution speed by minimizing the number of 
memory accesses necessary to execute a program. Sec- 
ond, it reduces the memory latency encountered when 
using slower, lower-cost memory parts. 

To support this concept, the architecture provides a 
generous supply of general-purpose registers. For each 
procedure, 32 registers are available, 28 of which are 
available for general use. These registers are divided 
into two types: global and local. Both types of registers 
can be used for general storage of operands. The only 
difference is that global registers retain their contents 
across procedure boundaries, whereas the processor al- 
locates a new set of local registers each time a new 
procedure is called. 


The architecture also provides a set of fast, versatile 
load and store instructions. These instructions allow 
burst transfers of 1, 2, 4, 8, 12 or 16 bytes of informa- 
tion between memory and the registers. 


On-Chip Caching of Code and Data 

To further reduce memory accesses, the architecture 
offers two mechanisms for caching code and data on 
chip: an instruction cache and multiple sets of local 
registers. The instruction cache allows prefetching of 
blocks of instruction from memory. This helps ensure 
that the instruction execution pipeline is supplied with 
a steady stream of instructions. It also reduces the 
number of memory accesses required when performing 
iterative operations such as loops. The architecture al- 
lows the size of the instruction cache to vary. For the 
i960 KB processor, it is 512 bytes. 

To optimize the architecture’s procedure call mechan- 
ism, the processor provides multiple sets of local regis- 
ters. This allows the processor to perform procedure 
calls without having to write the local registers out to 
the stack in memory. The number of register sets de- 
pends on the processor implementation. The i960 KB 
processor provides four sets of local registers. 
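
The effect of keeping several local register sets on chip can be sketched with a small model. The Python class below is an illustration only: the class name, the counters and the policy of writing one set to the stack when all sets are in use are assumptions, not a description of the processor's actual circuitry.

```python
class LocalRegisterCache:
    """Toy model of on-chip local register sets (four by default)."""

    def __init__(self, sets=4):
        self.sets = sets
        self.depth = 1    # procedures currently active (starts with one)
        self.cached = 1   # register sets currently held on chip
        self.spills = 0   # sets written out to the stack in memory
        self.fills = 0    # sets read back from the stack in memory

    def call(self):
        if self.cached == self.sets:
            self.spills += 1      # no free set: oldest set goes to memory
        else:
            self.cached += 1      # a free on-chip set is allocated
        self.depth += 1

    def ret(self):
        if self.depth == 1:
            raise RuntimeError("return without a matching call")
        self.depth -= 1
        self.cached -= 1
        if self.cached == 0:
            self.fills += 1       # caller's set must be reloaded
            self.cached = 1


cache = LocalRegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)  # 0: three nested calls need no memory traffic
cache.call()
print(cache.spills)  # 1: the fourth nested call exceeds the four sets
```

In this model, call chains that stay within four active procedures generate no stack traffic for local registers at all.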


Overlapped Instruction Execution 

The i960 architecture also enhances program execu- 
tion speed by overlapping the execution of some in- 
structions. In the i960 K series of processors, this is 
accomplished through register scoreboarding. 

Register scoreboarding permits instruction execution to 
continue while data is being fetched from memory. 
When a load instruction is executed, the processor sets 
one or more scoreboard bits to indicate the target regis- 
ters to be loaded. After the target registers are loaded, 
the scoreboard bits are cleared. While the target regis- 
ters are being loaded, the processor is allowed to exe- 
cute other instructions that do not use these registers. 

The processor uses the scoreboard bits to ensure that 
the target registers are not used until the load is com- 
plete. (Scoreboard bits are checked transparently from 
software.) This technique allows code to be executed 
such that some instructions can be executed in zero 
clock cycles (that is, executed for free). 
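
Register scoreboarding can be illustrated with a simplified timing model. The Python sketch below makes several assumptions that are not taken from this overview: instructions issue in order at one per clock, a load takes three clocks to deliver its data, and an instruction stalls only when it names a register whose scoreboard bit is still set. All names are hypothetical.

```python
def run(program, load_latency=3):
    """Return the clock count needed to complete `program`.

    program is a list of (op, dest, sources) tuples. A "load" marks its
    destination register busy until the data arrives.
    """
    busy_until = {}  # register -> clock at which its scoreboard bit clears
    clock = 0
    for op, dest, sources in program:
        needed = set(sources) | {dest}
        ready = max((busy_until.get(r, 0) for r in needed), default=0)
        clock = max(clock, ready)       # stall while a needed register is busy
        if op == "load":
            busy_until[dest] = clock + load_latency
        clock += 1                      # issuing takes one clock
    return max([clock] + list(busy_until.values()))


# The independent add runs while the load is outstanding.
overlapped = [("load", "r4", []),
              ("add", "r5", ["r6", "r7"]),
              ("add", "r8", ["r4", "r5"])]
# Here the first add needs r4 at once, so it waits for the load.
stalled = [("load", "r4", []),
           ("add", "r8", ["r4", "r6"]),
           ("add", "r5", ["r6", "r7"])]
print(run(overlapped), run(stalled))  # 4 5
```

The one-clock difference is the independent add that executes "for free" in the shadow of the load.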


Single-Clock Instructions 

The i960 architecture is designed to let a processor exe- 
cute commonly used instructions, such as moves, adds, 
subtracts, logical operations and branches, in a mini- 
mum number of clock cycles (preferably one cycle). 
The architecture supports this concept in several 
ways. For example, the load and store model described 
earlier eliminates the clock cycles required to perform 
memory-to-memory operations, by concentrating on 
register-to-register operations. 

In addition, all of the instructions in the i960 architec- 
ture are 32 bits long and aligned on 32-bit boundaries. 
This lets instructions be decoded in one clock cycle, 
and eliminates the need for an instruction-alignment 
stage in the pipeline. 

The i960 KB processor takes full advantage of these 
features of the architecture, resulting in more than 50 
instructions that can be executed in a single clock cycle. 


Efficient Interrupt Model 

The i960 architecture provides an efficient mechanism 
for servicing interrupts from external sources. To han- 
dle interrupts, the processor maintains an interrupt ta- 
ble of 248 interrupt vectors, 240 of which are available 
for general use. When an interrupt is signaled, the proc- 
essor uses a pointer to the interrupt table to perform an 
implicit call to an interrupt handler procedure. In per- 
forming this call, the processor automatically saves the 
state of the processor prior to receiving the interrupt, 
performs the interrupt routine, then restores the state of 
the processor. A separate interrupt stack is also provid- 
ed to segregate interrupt handling from application 
programs. 

The interrupt handling facilities also allow interrupts to 
be evaluated by priority. The processor is then able to 
store interrupt vectors that are lower in priority than 
the current processor task in a pending interrupt sec- 
tion of the interrupt table. The processor checks and 
services the pending interrupts at defined times. 
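
The priority evaluation described above can be sketched as follows. This Python model is a simplification under stated assumptions: vectors are taken to be numbered 8 through 255 (248 vectors) with priority equal to the vector number divided by 8, which is an i960 convention not spelled out in this overview, and the pending section of the interrupt table is represented by a plain list. The class and method names are hypothetical.

```python
class InterruptModel:
    """Toy model of priority-based interrupt posting and servicing."""

    def __init__(self, current_priority=0):
        self.current_priority = current_priority
        self.pending = []    # vectors posted for later servicing
        self.serviced = []   # vectors serviced, in order

    @staticmethod
    def priority(vector):
        if not 8 <= vector <= 255:
            raise ValueError("vector outside assumed range 8..255")
        return vector // 8   # assumed mapping: 32 priority levels

    def signal(self, vector):
        if self.priority(vector) > self.current_priority:
            self.serviced.append(vector)   # outranks the current task
        else:
            self.pending.append(vector)    # posted as pending

    def check_pending(self):
        """Service pending vectors that now outrank the current task."""
        ready = sorted((v for v in self.pending
                        if self.priority(v) > self.current_priority),
                       reverse=True)
        for v in ready:
            self.pending.remove(v)
            self.serviced.append(v)
        return ready


model = InterruptModel(current_priority=10)
model.signal(200)            # priority 25: serviced immediately
model.signal(40)             # priority 5: held as pending
model.current_priority = 0   # the task lowers its priority
print(model.check_pending())  # [40]
```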


SIMPLIFIED PROGRAMMING 
ENVIRONMENT 

Because of their streamlined execution environment, 
processors based on the i960 architecture are particu- 
larly easy to program. The following paragraphs de- 
scribe some of the architecture features that simplify 
programming. 

Highly Efficient Procedure Call 
Mechanism 

The procedure call mechanism makes procedure calls 
and parameter passing between procedures simple and 
compact. Each time a call instruction is issued, the 
processor automatically saves the current set of local 
registers and allocates a new set for the called proce- 
dure. Likewise, on a return from a procedure, the cur- 
rent set of local registers is deallocated and the local 
registers for the procedure being returned to are re- 
stored. This means a program never has to explicitly 
save and restore those local variables that are stored in 
local registers. 


Versatile Instruction Set and 
Addressing 

The selection of instructions and addressing modes also 
simplifies programming. A full set of load, store, move, 
arithmetic, comparison and branch instructions is 
provided, with operations on both integer and ordinal 
data types. Operations on bits and bit strings are simpli- 
fied by a complete set of Boolean and bit-field instruc- 
tions. 

The addressing modes are efficient and straightforward, 
while at the same time providing the necessary indexing 
and scaling modes required to address complex arrays 
and record structures. The large 4-gigabyte address 
space provides ample room to store programs and data. 
The availability of 32 addressing lines allows some ad- 
dress lines to be memory-mapped to control hardware 
functions. 


Extensive Fault Handling Capability 

To aid in program development, the i960 architecture 
defines a wide range of faults that the processor detects, 
including arithmetic faults, invalid operations, invalid 
operands and machine faults. When a fault is detected, 
the processor makes an implicit call to a fault handler 
routine, in a way similar to the interrupt mechanism 
described previously. The information collected for 
each fault allows program developers to quickly correct 
faulting code, and allows automatic recovery from 
some faults. 


Debugging and Monitoring 

To support debugging systems, the i960 architecture 
provides a mechanism for monitoring processor activity 
by means of trace events. When the processor detects a 
trace event, it signals a trace fault and calls a fault han- 
dler. Intel provides several tools that use this feature, 
including an in-circuit emulator (ICE) device. 


SUPPORT FOR ARCHITECTURAL 
EXTENSIONS 

The i960 architecture provides several features that en- 
able processors based on this architecture to be easily 
customized to meet the needs of specific embedded ap- 
plications, such as signal processing, array processing 
or graphics processing. 

The most important of these features is the set of 32 
special function registers. These registers provide a con- 
venient interface to circuitry in the processor or pins 
that can be connected to external hardware. They can 
be used to control timers, to perform operations on spe- 
cial data types or to perform I/O functions. The special 
function registers are similar to the global registers. 
They can be addressed by all of the register access in- 
structions. 


EXTENSIONS INCLUDED IN THE 
i960™ K SERIES PROCESSORS 

The i960 K series of processors provides a complete 
implementation of the i960 architecture, plus several 
extensions to that architecture. These extensions fall 
into two categories: floating-point processing and inter- 
agent communication. 


On-Chip Floating Point 

The i960 KB processor provides a complete implemen- 
tation of the IEEE standard for binary floating-point 
arithmetic (IEEE 754-1985). This implementation in- 
cludes a full set of floating-point operations, includ- 
ing add, subtract, multiply, divide, trigonometric func- 
tions and logarithmic functions. These operations are 
performed on single precision (32-bit), double precision 
(64-bit) and extended precision (80-bit) real numbers. 

One of the benefits of this implementation is that the 
floating-point handling facilities are integrated into the 
normal instruction execution environment. Single and 
double precision floating-point values are stored in the 
same registers as non-floating point values. Four 80-bit 
floating-point registers are provided to hold extended- 
precision values. 


Interagent Communication 

All of the processors in the i960 K series provide an 
inter-agent communication (IAC) mechanism, allowing 
agents connected to the processor’s bus to communi- 
cate with one another. This mechanism operates simi- 
larly to the interrupt mechanism, except that IAC mes- 
sages are passed through dedicated sections of memory. 
Tasks handled with IAC messages include proc- 
essor reinitialization, stopping the processor, purging 
the instruction cache and forcing the processor to check 
pending interrupts. 

EMBEDDED 32-BIT PROCESSOR 


■ High-Performance Embedded Architecture 
  - 25 MIPS Burst Execution at 25 MHz 
  - 9.4 MIPS* Sustained Execution at 25 MHz 

■ 512-Byte On-Chip Instruction Cache 
  - Direct Mapped 
  - Parallel Load/Decode for Uncached Instructions 

■ Pin Compatible with 80960KB 

■ Multiple Register Sets 
  - Sixteen Global 32-Bit Registers 
  - Sixteen Local 32-Bit Registers 
  - Four Local Register Sets Stored On-Chip 
  - Register Scoreboarding 

■ Built-In Interrupt Controller 
  - 32 Priority Levels, 256 Vectors 
  - 3.4 µs Latency @ 25 MHz 

■ Easy to Use, High Bandwidth 32-Bit Bus 
  - 66.7 Mbytes/s Burst 
  - Up to 16 Bytes Transferred per Burst 

■ 4 Gigabyte, Linear Address Space 

■ 132-Lead Pin Grid Array (PGA) Package 

■ 132-Lead Plastic Quad Flat Pack (PQFP) 

■ Uses 85C960 Bus Controller 

■ Supported by 27960KX Burst EPROMs 

The 80960KA is a member of Intel’s new 32-bit processor family, the i960 series, which is designed especially 
for embedded applications. It is based on the family’s high performance, common core architecture, and 
includes a 512-byte instruction cache and a built-in interrupt controller. The 80960KA has a large register set, 
multiple parallel execution units and a high-bandwidth, burst bus. Using advanced RISC technology, this high 
performance processor is capable of execution rates in excess of 9.4 million instructions per second.* The 
80960KA is well-suited for a wide range of embedded applications, including laser printers, image processing, 
industrial control, robotics and telecommunications. 

*Relative to Digital Equipment Corporation’s VAX-11/780** at 1 MIPS 



Figure 1. The 80960KA’s Highly Parallel Microarchitecture 

** VAX-11™ is a trademark of Digital Equipment Corporation. 


September 1991 
Order Number: 270775-004 









80960KA 




THE 960 SERIES 

The 80960KA is a member of a new family of 32-bit 
microprocessors from Intel known as the i960 Se- 
ries. This series was especially designed to serve 
the needs of embedded applications. The embed- 
ded market includes applications as diverse as in- 
dustrial automation, avionics, image processing, 
graphics, robotics, telecommunications and automo- 
biles. These types of applications require high 
integration, low power consumption, quick interrupt 
response times and high performance. Since time to 
market is critical, embedded microprocessors need 
to be easy to use in both hardware and software 
designs. 


All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications in the embedded 
market. For example, future processors may include 
a DMA controller, a timer or an A/D converter. 

Software written for the 80960KA will run without 
modification on any other member of the 80960 fam- 
ily. It is also pin-compatible with the 80960KB, which 
includes an integrated floating-point unit, and the 
80960MC, a military-grade version with support for 
multitasking, memory management, multiprocessing 
and fault tolerance. 

1. Register g15 is reserved for stack management functions. 

2. Registers r0, r1, and r2 are reserved for stack management functions. 



Figure 2. Register Set 

KEY PERFORMANCE FEATURES 

The 80960KA’s architecture is based on the most 
recent advances in RISC technology and is ground- 
ed in Intel’s long experience in designing embedded 
controllers. Many features contribute to the 
80960KA’s exceptional performance: 

1. Large Register Set. Modern compilers can take 
advantage of a large number of registers to optimize 
execution speed. For maximum flexibility, the 
80960KA provides 32 32-bit registers and four 80-bit 
floating-point registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions 
make up the bulk of instructions in most programs, 
so that execution speed can be greatly improved by 
ensuring that these core instructions execute in as 
short a time as possible. The most-frequently exe- 
cuted instructions such as register-register moves, 
add/subtract, logical operations, and shifts execute 
in one to two cycles. (Table 1 contains a list of in- 
structions.) 

3. Load/Store Architecture. One way to improve 
execution speed is to reduce the number of times 
that the processor must access memory to perform 
an operation. Like other processors based on RISC 
technology, the 80960KA has a Load/Store archi- 
tecture: only the LOAD and STORE instructions ref- 
erence memory; all other instructions operate on 
registers. 

Figure 3. Instruction Formats 

Table 1. 80960KA Instruction Set 

4. Simple Instruction Formats. All instructions in 
the 80960KA are 32 bits long and must be aligned 
on word boundaries. This alignment makes it possi- 
ble to eliminate the instruction-alignment stage in 
the pipeline. To simplify the instruction decoder fur- 
ther, there are only five instruction formats and each 
instruction uses only one format. (See Figure 3.) 

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960KA manages this process transpar- 
ently to software through the use of a register score- 
board. Conditional instructions also make use of a 
scoreboard so that subsequent unrelated instruc- 
tions can be executed while the conditional instruc- 
tion is pending. 

6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960KA gets 
optimal use of its memory bus bandwidth because 
the bus is tuned for use with the cache: the line size 
of the instruction cache matches the maximum burst 
size for instruction fetches. The 80960KA automati- 
cally fetches four words in a burst and stores them 
directly in the cache. Due to the size of the cache 
and the fact that it is continually filled in anticipation 
of needed instructions in the program flow, the 
80960KA is exceptionally insensitive to memory wait 
states. In fact, each wait state causes only a 7% 
degradation in system performance. The benefit is 
that the 80960KA will deliver outstanding perform- 
ance even with a low cost memory system. 

8. Cache Bypass. If there is a cache miss, the processor
fetches the needed instruction, then sends it
on to the instruction decoder at the same time it
updates the cache. Thus, no extra time is taken to 
load and read the cache. 


Memory Space and Addressing Modes 

The 80960KA offers a linear programming environment
so that all programs running on the processor
are contained in a single address space. The maximum
size of the address space is 4 Gigabytes (2^32
bytes).

For ease of use, the 80960KA has a small number of
addressing modes, but includes all those necessary
to ensure efficient compiler implementations of high-level
languages such as C, Fortran and Ada. Table 2
lists the memory addressing modes.


Data Types 

The 80960KA recognizes the following data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals 

• 8-, 16-, 32- and 64-bit integers

Non-Numeric: 

• Bit 

• Bit Field 

• Triple-Word (96 bits) 

• Quad-Word (128 bits)


Large Register Set 

The programming environment of the 80960KA in- 
cludes a large number of registers. In fact, 32 regis- 
ters are available at any time. The availability of this 
many registers greatly reduces the number of mem- 
ory accesses required to execute most programs, 
which leads to greater instruction processing speed. 

There are two types of general-purpose registers:
local and global. The global registers consist of sixteen
32-bit registers (G0 through G15). These registers
perform the same function as the general-purpose
registers provided in other popular microprocessors.
The term global refers to the fact that these
registers retain their contents across procedure
calls.

The local registers, on the other hand, are proce- 
dure specific. For each procedure call, the 80960KA 
allocates 16 local registers (R0 through R15). Each
local register is 32 bits wide. 


Multiple Register Sets 

To further increase the efficiency of the register set,
multiple sets of local registers are stored on-chip.
This cache holds up to four local register frames,
which means that up to three procedure calls can be
made without having to access the procedure stack
resident in memory.

Although programs may have procedure calls nested
many calls deep, a program typically oscillates
back and forth between only two or three levels. As
a result, with four stack frames in the cache, the
probability of there being a free frame in the cache
when a call is made is very high. In fact, runs of
representative C-language programs show that 80%
of the calls are handled without needing to access
memory.

Table 2. Memory Addressing Modes

• 12-Bit Offset
• 32-Bit Offset
• Register-Indirect
• Register + 12-Bit Offset
• Register + 32-Bit Offset
• Register + (Index-Register x Scale-Factor)
• Register x Scale-Factor + 32-Bit Displacement
• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement

Scale-Factor is 1, 2, 4, 8 or 16
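
The most general entry in Table 2, Register + (Index-Register x Scale-Factor) + 32-Bit Displacement, can be illustrated with a short sketch. The function name, its parameters, and the wrap-around at 2^32 are assumptions made for illustration only; they are not taken from the 80960KA documentation.

```python
MASK32 = 0xFFFFFFFF            # assumption: addresses wrap within the 2^32-byte space
VALID_SCALES = (1, 2, 4, 8, 16)  # scale factors listed under Table 2


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Return base + index * scale + displacement, truncated to 32 bits.

    Hypothetical helper: the simpler modes of Table 2 fall out by leaving
    the unused terms at their defaults (e.g. Register-Indirect is base only).
    """
    if scale not in VALID_SCALES:
        raise ValueError(f"scale factor must be one of {VALID_SCALES}")
    return (base + index * scale + displacement) & MASK32


# Fourth 32-bit element of an array whose base is in a register,
# offset by a 32-byte header.
print(hex(effective_address(base=0x1000, index=3, scale=4, displacement=0x20)))
```

Scaling the index by the element size is what lets a compiler address array elements of 1, 2, 4, 8 or 16 bytes without a separate multiply instruction.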

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.

Note that the global registers are not exchanged on
a procedure call, but retain their contents, making
them available to all procedures for fast parameter
passing. An illustration of the register cache is
shown in Figure 4.
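
The behavior of the four-frame register cache can be sketched with a small model. The class below is hypothetical and simplified: the spill rule (oldest frame first) follows the text above, while the refill-on-return rule is an assumption, since this section does not describe how frames are restored.

```python
from collections import deque


class RegisterCacheModel:
    """Toy model of the on-chip cache of local register frames."""

    def __init__(self, frames=4):
        self.capacity = frames
        self.resident = deque()  # frames held on-chip, oldest on the left
        self.in_memory = 0       # frames currently saved on the procedure stack
        self.spills = 0          # frames written out to memory
        self.fills = 0           # frames read back from memory

    def call(self):
        # With all frames in use, the oldest frame is moved to memory.
        if len(self.resident) == self.capacity:
            self.resident.popleft()
            self.in_memory += 1
            self.spills += 1
        self.resident.append("frame")

    def ret(self):
        self.resident.pop()      # the returning procedure's frame is released
        # Assumption: a caller frame that was spilled is reloaded on return.
        if not self.resident and self.in_memory:
            self.in_memory -= 1
            self.fills += 1
            self.resident.append("frame")


cache = RegisterCacheModel()
cache.call()                     # frame of the initial procedure
for _ in range(3):
    cache.call()                 # three nested calls fit on-chip
print(cache.spills)
```

In this model a program that oscillates between two or three call levels never touches memory once its frames are resident, which is the effect the 80% figure above describes.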


Instruction Cache

To further reduce memory accesses, the 80960KA 
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the
number of memory references required to read in- 
structions into the processor can be greatly reduced. 

To load the instruction cache, instructions are 
fetched in 16-byte blocks, so that up to four instruc- 
tions can be fetched at one time. An efficient 
prefetch algorithm increases the probability that an 
instruction will already be in the cache when it is 
needed. 

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the calling
procedure is likely to remain in the cache, so it
will be there on the procedure’s return.
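
The effect of a loop fitting in the cache can be sketched by counting 16-byte line fetches for an instruction address trace. The model assumes a direct-mapped organization and ignores prefetching; both are simplifying assumptions for illustration, not statements about the 80960KA's actual cache design.

```python
LINE_BYTES = 16                         # instructions are fetched in 16-byte blocks
CACHE_BYTES = 512                       # on-chip instruction cache size
NUM_LINES = CACHE_BYTES // LINE_BYTES   # 32 lines


def count_line_fetches(addresses):
    """Count memory line fetches for a sequence of instruction addresses."""
    tags = [None] * NUM_LINES           # assumption: direct-mapped, one tag per slot
    fetches = 0
    for addr in addresses:
        line = addr // LINE_BYTES
        slot = line % NUM_LINES
        if tags[slot] != line:          # miss: fetch the line from memory
            tags[slot] = line
            fetches += 1
    return fetches


# A 20-instruction loop (80 bytes) starting at 0x100, executed 100 times.
small_loop = [0x100 + 4 * i for i in range(20)] * 100
print(count_line_fetches(small_loop))
```

Under these assumptions the loop costs one fetch per line on its first pass and none afterwards, whereas a loop larger than 512 bytes would keep evicting its own lines.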


Register Scoreboarding 

The instruction decoder has been optimized in several
ways. One of these optimizations is the ability to
do instruction overlapping by means of register
scoreboarding.

Register scoreboarding occurs when a LOAD instruction
is executed to move a variable from memory
into a register. When the instruction is initiated, a
scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD R4, address 1
LOAD R5, address 2
Unrelated instruction
Unrelated instruction
ADD R4, R5, R6

In essence, the two unrelated instructions between
the LOAD and ADD instructions are executed for
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded.
Up to three instructions can be pending at one time
with three corresponding scoreboard bits set. By exploiting
this feature, system programmers and compilers
have a useful tool for optimizing execution
speed.
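
The "free" execution of the unrelated instructions can be reproduced with a toy issue model. The three-cycle load latency, the one-instruction-per-cycle issue rate, and the tuple encoding of instructions are all hypothetical; the sketch also does not model the limit of three pending loads.

```python
LOAD_LATENCY = 3  # hypothetical cycles between issuing a LOAD and data arrival


def cycles_to_issue(program, load_latency=LOAD_LATENCY):
    """Return the cycle count needed to issue every instruction in order.

    program is a list of (opcode, destination, sources) tuples. A register
    with a pending load is "scoreboarded" until the recorded cycle.
    """
    busy_until = {}  # register -> cycle at which its scoreboard bit is reset
    cycle = 0
    for opcode, dest, sources in program:
        # Stall while any source register still has its scoreboard bit set.
        for reg in sources:
            cycle = max(cycle, busy_until.get(reg, 0))
        if opcode == "LOAD":
            busy_until[dest] = cycle + load_latency
        cycle += 1   # the instruction itself issues in one cycle
    return cycle


loads = [("LOAD", "R4", []), ("LOAD", "R5", [])]
add = [("ADD", "R6", ["R4", "R5"])]
fillers = [("OP", "R8", ["R9"]), ("OP", "R10", ["R11"])]

print(cycles_to_issue(loads + add))            # ADD stalls on the loads
print(cycles_to_issue(loads + fillers + add))  # fillers occupy the stall cycles
```

Both sequences finish in the same number of cycles in this model, because the two filler instructions run in cycles that the ADD would otherwise spend waiting on the scoreboard.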


High Bandwidth Local Bus

An 80960KA CPU resides on a high-bandwidth address/data
bus known as the local bus (L-Bus). The
L-Bus provides a direct communication path between
the processor and the memory and I/O subsystem
interfaces. The processor uses the local bus
to fetch instructions, manipulate memory, and respond
to interrupts. Its features include:

• 32-bit multiplexed address/data path

• Four-word burst capability, which allows transfers
from 1 to 16 bytes at a time

• High bandwidth reads and writes at 66.7 Mbytes
per second

• Special signal to indicate whether a memory
transaction can be cached

Figure 5 identifies the groups of signals which constitute
the L-Bus. Table 4 lists the function of the L-Bus
and other processor-support signals, such as
the interrupt lines.

Figure 5. Local Bus Signal Groups


Interrupt Handling

The 80960KA can be interrupted in one of two ways:
by the activation of one of four interrupt pins or by
sending a message on the processor’s data bus.

The 80960KA is unusual in that it automatically handles
interrupts on a priority basis and tracks pending
interrupts through its on-chip interrupt controller.
Two of the interrupt pins can be configured to provide
8259A handshaking for expansion beyond four
interrupt lines.


Debug Features

The 80960KA has built-in debug capabilities. There
are two types of breakpoints and six different trace
modes. The debug features are controlled by two
internal 32-bit registers, the Process-Controls Word
and the Trace-Controls Word. By setting bits in
these control words, a software debug monitor can
closely control how the processor responds during
program execution.

The 80960KA has both hardware and software
breakpoints. It provides two hardware breakpoint
registers on-chip which can be set by a special command
to any value. When the instruction pointer
matches the value in one of the breakpoint registers,
the breakpoint will fire, and a breakpoint handling
routine is called automatically.

The 80960KA also provides software breakpoints
through the use of two instructions, MARK and
FMARK. These instructions can be placed at any
point in a program and will cause the processor to
halt execution at that point and call the breakpoint
handling routine. The breakpoint mechanism is easy
to use and provides a powerful debugging tool.

Tracing is available for instructions (single-step execution),
calls and returns, and branching. Each different
type of trace may be enabled separately by a
special debug instruction. In each case, the
80960KA executes the instruction first and then
calls a trace handling routine (usually part of a software
debug monitor). Further program execution is
halted until the trace routine is completed. When the
trace event handling routine is completed, instruction
execution resumes at the next instruction. The
80960KA’s tracing mechanisms, which are implemented
completely in hardware, greatly simplify the
task of testing and debugging software.


FAULT DETECTION 

The 80960KA has an automatic mechanism to 
handle faults. There are ten fault types including 
trace, arithmetic, and floating-point faults. When the 
processor detects a fault, it automatically calls the 
appropriate fault handling routine and saves the cur- 
rent instruction pointer and necessary state informa- 
tion to make efficient recovery possible. The proces- 
sor posts diagnostic information on the type of fault 
to a Fault Record. Like interrupt handling routines, 
fault handling routines are usually written to meet 
the needs of a specific application and are often in- 
cluded as part of the operating system or kernel. 

For each of the ten fault types, there are numerous 
subtypes that provide specific information about a 
fault. For example, a floating-point fault may have its 
subtype set to an Overflow or Zero-Divide fault. The 
fault handler can use this specific information to re- 
spond correctly to the fault. 


BUILT-IN TESTABILITY 

Upon reset, the 80960KA automatically conducts an
extensive internal test (self-test) of its major blocks
of logic. Then, before executing its first instruction, it
does a zero checksum on the first eight words in
memory to ensure that the system has been loaded
correctly. If a problem is discovered at any point during
the self-test, the 80960KA will assert its FAILURE
pin and will not begin program execution. The
self-test takes approximately 47,000 cycles to complete.

System manufacturers can use the 80960KA’s self-test
feature during incoming parts inspection. No
special diagnostic programs need to be written, and
the test is both thorough and fast. The self-test capability
helps ensure that defective parts will be discovered
before systems are shipped, and once in
the field, the self-test makes it easier to distinguish
between problems caused by processor failure and
problems resulting from other causes.


CHMOS 

The 80960KA is fabricated using Intel’s CHMOS IV 
(Complementary High Speed Metal Oxide Semicon- 
ductor) process. This advanced technology elimi- 
nates the frequency and reliability limitations of older 
CMOS processes and opens a new era in micro- 
processor performance. It combines the high per- 
formance capabilities of Intel’s industry-leading 
HMOS technology with the high density and low 
power characteristics of CMOS. The 80960KA is
available at 10, 16, 20 and 25 MHz. 


Table 4a. 80960KA Pin Description: L-Bus Signals

Symbol        Type      Name and Function

CLK2          I         SYSTEM CLOCK provides the fundamental timing for 80960KA
                        systems. It is divided by two inside the 80960KA to
                        generate the internal processor clock.

LAD31-LAD0    I/O       LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses
              T.S.      and data to and from memory. During an address (Ta)
                        cycle, bits 2-31 contain a physical word address (bits
                        0-1 indicate SIZE; see below). During a data (Td) cycle,
                        bits 0-31 contain read or write data. The LAD lines are
                        active HIGH and float to a high impedance state when not
                        active.

                        SIZE, which consists of bits 0-1 of the LAD lines during
                        a Ta cycle, specifies the size of a burst transfer in
                        words.

                        LAD1  LAD0
                        0     0     1 Word
                        0     1     2 Words
                        1     0     3 Words
                        1     1     4 Words

ALE           O         ADDRESS-LATCH ENABLE indicates the transfer of a physical
              T.S.      address. ALE is asserted during a Ta cycle and deasserted
                        before the beginning of the Td state. It is active LOW
                        and floats to a high impedance state during a hold cycle
                        (Th or Thr).

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state
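
The address-cycle encoding described for the LAD lines can be sketched as a small decoder: bits 2-31 carry the word address and bits 0-1 carry SIZE. The function name and its return format are hypothetical and chosen only for illustration.

```python
def decode_address_cycle(lad):
    """Split a 32-bit LAD value sampled in an address cycle.

    Returns (byte address of the first word, burst length in words).
    """
    size_bits = lad & 0b11             # LAD1:LAD0 encode SIZE
    burst_words = size_bits + 1        # 00 -> 1 word ... 11 -> 4 words
    word_address = lad & 0xFFFFFFFC    # bits 2-31, low two bits cleared
    return word_address, burst_words


address, words = decode_address_cycle(0x00001003)
print(hex(address), words)
```

Because every access is word aligned, the two low address bits are free to carry the burst length; byte selection within a word is left to the byte enable lines.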

Table 4a. 80960KA Pin Description: L-Bus Signals (Continued)

Symbol        Type      Name and Function

ADS           O         ADDRESS/DATA STATUS indicates an address state. ADS is
              O.D.      asserted every Ta state and deasserted during the
                        following Td state. For a burst transaction, ADS is
                        asserted again every Td state where READY was asserted
                        in the previous cycle.

W/R           O         WRITE/READ specifies, during a Ta cycle, whether the
              O.D.      operation is a write or read. It is latched on-chip and
                        remains valid during Td cycles.

DT/R          O         DATA TRANSMIT/RECEIVE indicates the direction of data
              O.D.      transfer to and from the L-Bus. It is low during Ta and
                        Td cycles for a read or interrupt acknowledgement; it is
                        high during Ta and Td cycles for a write. DT/R never
                        changes state when DEN is asserted (see Timing Diagrams).

DEN           O         DATA ENABLE is asserted during Td cycles and indicates
              O.D.      transfer of data on the LAD bus lines.

READY         I         READY indicates that data on LAD lines can be sampled or
                        removed. If READY is not asserted during a Td cycle, the
                        Td cycle is extended to the next cycle by inserting a
                        wait state (Tw), and ADS is not asserted in the next
                        cycle.

LOCK          I/O       BUS LOCK prevents other bus masters from gaining control
              O.D.      of the L-Bus following the current cycle (if they would
                        assert LOCK to do so). LOCK is used by the processor or
                        any bus agent when it performs indivisible
                        Read/Modify/Write (RMW) operations. Do not leave LOCK
                        unconnected. It must be pulled high for the processor to
                        function properly.

                        For a read that is designated as an RMW-read, LOCK is
                        examined: if asserted, the processor waits until it is
                        not asserted; if not asserted, the processor asserts
                        LOCK during the Ta cycle and leaves it asserted.

                        A write that is designated as an RMW-write deasserts
                        LOCK in the Ta cycle.

                        During the time LOCK is asserted, a bus agent can perform
                        a normal read or write but no RMW operations. LOCK is
                        also held asserted during an interrupt-acknowledge
                        transaction.

BE3-BE0       O         BYTE ENABLE LINES specify which data bytes (up to four)
              O.D.      on the bus take part in the current bus cycle. BE3
                        corresponds to LAD31-LAD24 and BE0 corresponds to
                        LAD7-LAD0.

                        The byte enables are provided in advance of data. The
                        byte enables asserted during Ta specify the bytes of the
                        first data word. The byte enables asserted during Td
                        specify the bytes of the next data word (if any), that
                        is, the word to be transmitted following the next
                        assertion of READY. The byte enables during the Td cycles
                        preceding the last assertion of READY are undefined. The
                        byte enables are latched on-chip and remain constant from
                        one Td cycle to the next when READY is not asserted.

                        For reads, the byte enables specify the byte(s) that the
                        processor will actually use. L-Bus agents are required to
                        assert only adjacent byte enables (e.g., asserting just
                        BE0 and BE2 is not permitted), and are required to assert
                        at least one byte enable. To produce address bits A0 and
                        A1 externally, they can be decoded from the byte enables.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state
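
The byte enable rules (at least one enable, adjacent enables only) and the decoding of A1:A0 can be sketched as follows. The helper is hypothetical; it takes the set of byte lanes whose enable is asserted, independent of the electrical polarity of the pins, and it assumes that A1:A0 equals the number of the lowest asserted lane.

```python
def decode_byte_enables(asserted_lanes):
    """Validate a set of asserted byte lanes (0-3) and derive A1:A0.

    Returns (low address bits, number of bytes transferred).
    """
    lanes = sorted(asserted_lanes)
    if not lanes:
        raise ValueError("at least one byte enable must be asserted")
    if lanes[0] < 0 or lanes[-1] > 3:
        raise ValueError("byte lanes are numbered 0 through 3")
    if lanes != list(range(lanes[0], lanes[-1] + 1)):
        raise ValueError("only adjacent byte enables may be asserted")
    # Assumption: the lowest asserted lane gives address bits A1:A0.
    return lanes[0], len(lanes)


print(decode_byte_enables({2, 3}))   # upper half-word of the 32-bit word
```

A combination such as lanes 0 and 2 raises an error in this sketch, mirroring the restriction stated in the table.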

Table 4a. 80960KA Pin Description: L-Bus Signals (Continued)

Symbol        Type      Name and Function

HOLD/         I         HOLD: If the processor is the primary bus master (PBM),
HLDAR                   the input is interpreted as HOLD, a request from a
                        secondary bus master to acquire the bus. When the
                        processor receives HOLD and grants another master control
                        of the bus, it floats its tri-state bus lines and then
                        asserts HLDA and enters the Th state. When HOLD is
                        deasserted, the processor will deassert HLDA and go to
                        either the Ti or Ta state.

                        HOLD ACKNOWLEDGE RECEIVED: If the processor is a
                        secondary bus master (SBM), the input is HLDAR, which
                        indicates, when HOLDR output is high, that the processor
                        has acquired the bus. Processors and other agents can be
                        told at reset if they are the primary bus master (PBM).

HLDA/         O         HOLD ACKNOWLEDGE: If the processor is a primary bus
HOLDR         T.S.      master, the output is HLDA, which relinquishes control of
                        the bus to another bus master.

                        HOLD REQUEST: For secondary bus masters (SBM), the output
                        is HOLDR, which is a request to acquire the bus. The bus
                        is said to be acquired if the agent is a primary bus
                        master and does not have its HLDA output asserted, or if
                        the agent is a secondary bus master and has its HOLD
                        input and HLDA output asserted.

CACHE         O         CACHE indicates if an access is cacheable during a Ta
              T.S.      cycle. It is not asserted during any synchronous access,
                        such as a synchronous load or move instruction used for
                        sending an IAC message. The CACHE signal floats to a high
                        impedance state when the processor is idle.


Table 4b. 80960KA Pin Description: Module Support Signals

Symbol        Type      Name and Function

BADAC         I         BAD ACCESS, if asserted in the cycle following the one in
                        which the last READY of a transaction is asserted,
                        indicates that an unrecoverable error has occurred on the
                        current bus transaction, or that a synchronous load/store
                        instruction has not been acknowledged.

                        STARTUP: During system reset, the BADAC signal is
                        interpreted differently. If the signal is high, it
                        indicates that this processor will perform system
                        initialization. If it is low, another processor in the
                        system will perform system initialization instead.

RESET         I         RESET clears the internal logic of the processor and
                        causes it to re-initialize.

                        During RESET assertion, the input pins are ignored
                        (except for BADAC and IAC/INT0), the tri-state output
                        pins are placed in a high impedance state, and other
                        output pins are placed in their non-asserted state.

                        RESET must be asserted for at least 41 CLK2 cycles for a
                        predictable RESET.

                        The HIGH to LOW transition of RESET should occur after
                        the rising edge of both CLK2 and the external bus CLK,
                        and before the next rising edge of CLK2.

FAILURE       O         INITIALIZATION FAILURE indicates that the processor has
              O.D.      failed to initialize correctly. After RESET is deasserted
                        and before the first bus transaction begins, FAILURE is
                        asserted while the processor performs a self-test. If the
                        self-test completes successfully, then FAILURE is
                        deasserted. Next, the processor performs a zero checksum
                        on the first eight words of memory. If it fails, FAILURE
                        is asserted for a second time and remains asserted; if it
                        passes, system initialization continues and FAILURE
                        remains deasserted.

N.C.          N/A       NOT CONNECTED indicates pins should not be connected.
                        Never connect any pin marked N.C.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state

Table 4b. 80960KA Pin Description: Module Support Signals (Continued)

Symbol        Type      Name and Function

IAC/          I         INTERAGENT COMMUNICATION REQUEST/INTERRUPT 0 indicates
INT0                    either that there is a pending IAC message for the
                        processor or an interrupt. The bus interrupt control
                        register determines in which way the signal should be
                        interpreted.

                        To signal an interrupt or IAC request in a synchronous
                        system, this pin (as well as the other interrupt pins)
                        must be enabled by being deasserted for at least one bus
                        cycle and then asserted for at least one additional bus
                        cycle; in an asynchronous system, the pin must remain
                        deasserted for at least two bus cycles and then be
                        asserted for at least two more bus cycles.

                        LOCAL PROCESSOR NUMBER: This signal is interpreted
                        differently during system reset. If the signal is at a
                        high voltage level, it indicates that this processor is a
                        primary bus master (Local Processor Number = 0); if it is
                        at a low voltage level, it indicates that this processor
                        is a secondary bus master (Local Processor Number = 1).

INT1          I         INTERRUPT 1, like INT0, provides direct interrupt
                        signaling.

INT2/         I         INTERRUPT 2/INTERRUPT REQUEST: The bus interrupt control
INTR                    register determines how this pin is interpreted. If INT2,
                        it has the same interpretation as the INT0 and INT1 pins.
                        If INTR, it is used to receive an interrupt request from
                        an external interrupt controller.

INT3/         I/O       INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt
INTA          O.D.      control register determines how this pin is interpreted.
                        If INT3, it has the same interpretation as the INT0,
                        INT1, and INT2 pins. If INTA, it is used as an output to
                        control interrupt-acknowledge bus transactions. The INTA
                        output is latched on-chip and remains valid during Td
                        cycles; as an output, it is open-drain.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state
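
The pin sequence required to signal an interrupt or IAC request (deasserted, then asserted, for one bus cycle each in a synchronous system or two each in an asynchronous system) can be sketched as a check over per-cycle samples. The function and its sample format are hypothetical, and the sketch treats the required cycles as consecutive.

```python
def request_recognized(samples, synchronous=True):
    """Return True if the sampled pin sequence signals a valid request.

    samples: one boolean per bus cycle, True meaning the pin is asserted.
    """
    need = 1 if synchronous else 2   # cycles required in each state
    low_run = 0                      # consecutive deasserted cycles just seen
    high_run = 0                     # consecutive asserted cycles since enabling
    enabled = False                  # pin was deasserted long enough
    for asserted in samples:
        if asserted:
            low_run = 0
            if enabled:
                high_run += 1
                if high_run >= need:
                    return True
        else:
            low_run += 1
            high_run = 0
            enabled = low_run >= need
    return False


print(request_recognized([False, True]))                     # synchronous system
print(request_recognized([False, True, True], synchronous=False))
```

The second call fails in this model because the pin was deasserted for only one cycle before being asserted, which is too short for an asynchronous system.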


ELECTRICAL SPECIFICATIONS 


Power and Grounding 

The 80960KA is implemented in CHMOS IV technol- 
ogy and has modest power requirements. Its high 
clock frequency and numerous output buffers (ad- 
dress/data, control, error, and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean on- 
chip power distribution at high frequency, 12 Vcc 
and 13 Vss pins separately feed functional units of 
the 80960KA in the PGA. 

Power and ground connections must be made to all 
power and ground pins of the 80960KA. On the cir- 
cuit board, all Vcc pins must be strapped closely 
together, preferably on a power plane. Likewise, all 
Vss pins should be strapped together, preferably on 
a ground plane. These pins may not be connected 
together within the chip. 


Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960KA. The processor can cause transient
power surges when driving the L-Bus, particularly
when it is connected to a large capacitive load.

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and de- 
coupling capacitors as much as possible. Capacitors 
specifically designed for PGA packages are also 
commercially available and offer the lowest possible 
inductance. 


Connection Recommendations 

For reliable operation, always connect unused in- 
puts to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be pulled up. No inputs should ever be left floating. 

All open-drain outputs require a pullup device. While
in some cases a simple pullup resistor will be adequate,
we recommend a network of pullup and pulldown
resistors biased to a valid VIH (>3.4V) and
terminated in the characteristic impedance of the circuit
board. Figure 6 shows our recommendations for
the resistor values for both a low and high current
drive network, which assumes that the circuit board
has a characteristic impedance of 100Ω. The advantage
of terminating the output signals in this fashion
is that it limits signal swing and reduces AC power
consumption.


Characteristic Curves

Figure 7 shows the typical supply current requirements
over the operating temperature range of the
processor at supply voltage (Vcc) of 5V. Figure 8
shows the typical power supply current (Icc) required
by the 80960KA at various operating frequencies
when measured at three input voltage (Vcc)
levels.

For a given output current (IOL), the curve in Figure 9
shows the worst case output low voltage (VOL).

Figure 10 shows the typical capacitive derating
curve for the 80960KA measured from 1.5V on the
system clock (CLK) to 1.5V on the falling edge and
1.5V on the rising edge of the L-Bus address/data
(LAD) signals.


Test Load Circuit

Figure 13 illustrates the load circuit used to test the
80960KA’s tristate pins, and Figure 14 shows the
load circuit used to test the open drain outputs. The
open drain test uses an active load circuit in the form
of a matched diode bridge. Since the open-drain outputs
sink current, only the IOL legs of the bridge are
necessary and the IOH legs are not used. When the
80960KA driver under test is turned off, the output
pin is pulled up to VREF (i.e., VOH). Diode D1 is
turned off and the IOL current source flows through
diode D2.

When the 80960KA open-drain driver under test is
on, diode D1 is also on, and the voltage on the pin
being tested drops to VOL. Diode D2 turns off and
IOL flows through diode D1.



Figure 6. Connection Recommendations for Low and High Current Drive Networks

Low Drive Network (180Ω to Vcc, 390Ω to ground on each output):
• VOH = 3.42V
• IOL = 25.3 mA

High Drive Network (130Ω to Vcc, 280Ω to ground on each output):
• VOH = 3.41V
• IOL = 33.8 mA
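
The output-high levels quoted for the two networks in Figure 6 follow from a resistor divider. The sketch below assumes a nominal Vcc of 5V and that the smaller resistor of each pair is the pullup and the larger one the pulldown; the function name is hypothetical.

```python
def divider(r_pullup, r_pulldown, vcc=5.0):
    """Return (output-high voltage, Thevenin resistance) of a pullup/pulldown pair."""
    total = r_pullup + r_pulldown
    voh = vcc * r_pulldown / total            # level when the open-drain driver is off
    r_thevenin = r_pullup * r_pulldown / total  # parallel combination seen by the line
    return voh, r_thevenin


for name, up, down in (("Low drive", 180, 390), ("High drive", 130, 280)):
    voh, r_th = divider(up, down)
    print(f"{name}: VOH = {voh:.2f} V, Thevenin resistance = {r_th:.0f} ohm")
```

Both networks bias the line just above the 3.4V level recommended in the text, and their parallel resistance is in the neighborhood of the 100Ω board impedance the recommendation assumes.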

Figure 9. Worst Case Voltage vs Output Current on Open-Drain Pins

Figure 10. Capacitive Derating Curve

ABSOLUTE MAXIMUM RATINGS*

Operating Temperature     0°C to +85°C Case
Storage Temperature       -65°C to +150°C
Voltage on Any Pin        -0.5V to Vcc + 0.5V
Power Dissipation         2.5W (25 MHz)


NOTICE: This is a production data sheet. The specifications
are subject to change without notice.

* WARNING: Stressing the device beyond the “Absolute
Maximum Ratings” may cause permanent damage.
These are stress ratings only. Operation beyond the
“Operating Conditions” is not recommended and extended
exposure beyond the “Operating Conditions”
may affect device reliability.


DC CHARACTERISTICS 

PGA:

80960KA (16 MHz): Tcase = 0°C to +85°C, Vcc = 5V ±10%
80960KA (20 and 25 MHz): Tcase = 0°C to +85°C, Vcc = 5V ±5%


PQFP:

80960KA (10 and 16 MHz): Tcase = 0°C to +100°C, Vcc = 5V ±10%
80960KA (20 MHz): Tcase = 0°C to +100°C, Vcc = 5V ±5%


Symbol  Parameter                    Min        Max         Units  Test Conditions

VIL     Input Low Voltage            -0.3       +0.8        V
VIH     Input High Voltage           2.0        Vcc + 0.3   V
VCL     CLK2 Input Low Voltage       -0.3       +0.8        V
VCH     CLK2 Input High Voltage      0.55 Vcc   Vcc + 0.3   V
VOL     Output Low Voltage                      0.45        V      (1, 5)
VOH     Output High Voltage          2.4                    V      (2, 4)
ICC     Power Supply Current:
          10 MHz                                300         mA
          16 MHz                                375         mA
          20 MHz                                420         mA
          25 MHz                                480         mA
ILI     Input Leakage Current                   ±15         µA     0 ≤ VIN ≤ Vcc
ILO     Output Leakage Current                  ±15         µA     0.45 ≤ VO ≤ Vcc
CIN     Input Capacitance                       10          pF     fC = 1 MHz (3)
CO      I/O or Output Capacitance               12          pF     fC = 1 MHz (3)
CCLK    Clock Capacitance                       10          pF     fC = 1 MHz (3)


NOTES:

1. For tri-state outputs, this parameter is measured at:
     Address/Data   4.0 mA
     Controls       5.0 mA

2. This parameter is measured at:
     Address/Data   -1.0 mA
     Controls       -0.9 mA
     ALE            -5.0 mA

3. Input, output, and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. For open-drain outputs: 25 mA
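
As a rough cross-check of the supply figures, worst-case supply power can be estimated as the maximum supply current multiplied by the supply voltage. This is a simplified sketch: it ignores any power delivered to external loads, and the function name and lookup table are hypothetical; only the current values come from the DC characteristics.

```python
# Maximum power supply current in mA, from the DC characteristics.
ICC_MAX_MA = {10: 300, 16: 375, 20: 420, 25: 480}


def supply_power_w(freq_mhz, vcc=5.0):
    """Estimate worst-case supply power in watts at a given supply voltage."""
    return ICC_MAX_MA[freq_mhz] / 1000.0 * vcc


for freq in sorted(ICC_MAX_MA):
    print(f"{freq} MHz: {supply_power_w(freq):.2f} W at 5.0 V")

# 25 MHz part at the upper end of its 5V ±5% supply range.
print(f"{supply_power_w(25, vcc=5.25):.2f} W")
```

For the 25 MHz part this estimate lands close to the 2.5W power dissipation listed under the absolute maximum ratings.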

AC SPECIFICATIONS

This section describes the AC specifications for the
80960KA pins. All input and output timings are specified
relative to the 1.5V level of the rising edge. For
output timings, the specifications refer to the time it
takes the signal to reach 1.5V. For input timings,
the specifications refer to the time at which the signal
reaches (for input setup) or leaves (for hold time)
the TTL levels of LOW (0.8V) or HIGH (2.0V). All AC
testing should be done with input clock voltages of
0.4V and 2.4V, except for the clock (CLK2), which
should be tested with input voltages of 0.45 Vcc and
0.55 Vcc.

Figure 12. Timing Relationship of L-Bus Signals

AC Specification Tables

80960KA AC Characteristics (10 MHz, PQFP Only)

Symbol  Parameter                       Min    Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)   50     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2) 12           ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time       12           ns     VIH = 90% Point
        (CLK2)                                              = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time              10    ns     VIN = 90% Point to 10% Point
        (CLK2)
T5      Processor Clock Rise Time              10    ns     VIN = 10% Point to 90% Point
        (CLK2)
T6      Output Valid Delay              2      25    ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)(2)
T6H     HOLDA Output Valid Delay        4      31    ns     CL = 75 pF
T7      ALE Width                       25           ns     CL = 75 pF
T8      ALE Output Valid Delay          0      20    ns     CL = 75 pF(2)
T9      Output Float Delay              2      20    ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T9H     HOLDA Output Float Delay        4      20    ns     CL = 75 pF
T10     Input Setup 1                   3            ns
T11     Input Hold                      5            ns
T11H    HOLD Input Hold                 4            ns
T12     Input Setup 2                   8            ns
T13     Setup to ALE Inactive           10           ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T14     Hold after ALE Inactive         8            ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T15     Reset Hold                      3            ns
T16     Reset Setup                     5            ns
T17     Reset Width                     1640         ns     41 CLK2 Periods Minimum


NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall time is not tested.

80960KA AC Characteristics (16 MHz)

Symbol  Parameter                       Min    Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)   31.25  125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2) 8            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time       8            ns     VIH = 90% Point
        (CLK2)                                              = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time              10    ns     VIN = 90% Point to 10% Point
        (CLK2)
T5      Processor Clock Rise Time              10    ns     VIN = 10% Point to 90% Point
        (CLK2)
T6      Output Valid Delay              2      25    ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T6H     HOLDA Output Valid Delay        4      31    ns     CL = 75 pF
T7      ALE Width                       15           ns     CL = 75 pF
T8      ALE Output Valid Delay          0      20    ns     CL = 75 pF(2)
T9      Output Float Delay              2            ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)(2)
T9H     HOLDA Output Float Delay        4            ns     CL = 75 pF
T10     Input Setup 1                   3            ns
T11     Input Hold                      5            ns
T11H    HOLD Input Hold                 4            ns
T12     Input Setup 2                   8            ns
T13     Setup to ALE Inactive           10           ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T14     Hold after ALE Inactive         8            ns     CL = 100 pF (LAD)
                                                            CL = 75 pF (Controls)
T15     Reset Hold                      3            ns
T16     Reset Setup                     5            ns
T17     Reset Width                     1281         ns     41 CLK2 Periods Minimum


NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall time is not tested.
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80960KA AC Characteristics (20 MHz) 


Symbol | Parameter                        | Min   | Max | Units | Test Conditions
-------+----------------------------------+-------+-----+-------+----------------
T1     | Processor Clock Period (CLK2)    | 25    | 125 | ns    | VIN = 1.5V
T2     | Processor Clock Low Time (CLK2)  | 6     |     | ns    | VIL = 10% Point = 1.2V
T3     | Processor Clock High Time (CLK2) | 6     |     | ns    | VIH = 90% Point = 0.1V + 0.5 VCC
T4     | Processor Clock Fall Time (CLK2) |       | 10  | ns    | VIN = 90% Point to 10% Point
T5     | Processor Clock Rise Time (CLK2) |       | 10  | ns    | VIN = 10% Point to 90% Point
T6     | Output Valid Delay               | 2     | 20  | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T6H    | HOLDA Output Valid Delay         | 4     | 26  | ns    | CL = 50 pF
T7     | ALE Width                        | 12    |     | ns    | CL = 50 pF
T8     | ALE Output Valid Delay           | 0     | 20  | ns    | CL = 50 pF (2)
T9     | Output Float Delay               | 2     | 20  | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls) (2)
T9H    | HOLDA Output Float Delay         | 4     | 20  | ns    | CL = 50 pF
T10    | Input Setup 1                    | 3     |     | ns    |
T11    | Input Hold                       | 5     |     | ns    |
T11H   | HOLD Input Hold                  | 4     |     | ns    |
T12    | Input Setup 2                    | 7     |     | ns    |
T13    | Setup to ALE Inactive            | 10    |     | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T14    | Hold after ALE Inactive          | 8     |     | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T15    | Reset Hold                       | 3     |     | ns    |
T16    | Reset Setup                      | 5     |     | ns    |
T17    | Reset Width                      | 1025  |     | ns    | 41 CLK2 Periods Minimum


NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall time is not tested.


[Figure 13 diagram: 80960KA tri-state output driving load capacitance CL]

Figure 13. Test Load Circuit for Tri-State Output Pins


[Figure 14 diagram: 80960KA open-drain output with load capacitance CL]

IOL Tested at 25 mA
VREF = VCC
D1 and D2 are matched

Figure 14. Test Load Circuit for Open-Drain Output Pins
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80960KA AC Characteristics (25 MHz, PGA Only) 


Symbol | Parameter                        | Min   | Max | Units | Test Conditions
-------+----------------------------------+-------+-----+-------+----------------
T1     | Processor Clock Period (CLK2)    | 20    | 125 | ns    | VIN = 1.5V
T2     | Processor Clock Low Time (CLK2)  | 5     |     | ns    | VIL = 10% Point = 1.2V
T3     | Processor Clock High Time (CLK2) | 5     |     | ns    | VIH = 90% Point = 0.1V + 0.5 VCC
T4     | Processor Clock Fall Time (CLK2) |       | 10  | ns    | VIN = 90% Point to 10% Point
T5     | Processor Clock Rise Time (CLK2) |       | 10  | ns    | VIN = 10% Point to 90% Point
T6     | Output Valid Delay               | 2     | 18  | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T6H    | HOLDA Output Valid Delay         | 4     | 24  | ns    | CL = 50 pF
T7     | ALE Width                        | 12    |     | ns    | CL = 50 pF
T8     | ALE Output Valid Delay           | 0     | 20  | ns    | CL = 50 pF (2)
T9     | Output Float Delay               | 2     | 18  | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T9H    | HOLDA Output Float Delay         | 4     | 20  | ns    | CL = 50 pF
T10    | Input Setup 1                    | 3     |     | ns    |
T11    | Input Hold                       | 5     |     | ns    |
T11H   | HOLD Input Hold                  | 4     |     | ns    |
T12    | Input Setup 2                    | 7     |     | ns    |
T13    | Setup to ALE Inactive            | 8     |     | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T14    | Hold after ALE Inactive          | 8     |     | ns    | CL = 60 pF (LAD); CL = 50 pF (Controls)
T15    | Reset Hold                       | 3     |     | ns    |
T16    | Reset Setup                      | 5     |     | ns    |
T17    | Reset Width                      | 820   |     | ns    | 41 CLK2 Periods Minimum


NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall time is not tested.
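
The Reset Width (T17) minimum in the AC characteristics tables is given both in nanoseconds and as 41 CLK2 periods. The following Python sketch checks that relationship against the minimum CLK2 period (T1) of each speed grade. The function and constant names are illustrative only and are not part of any Intel tool.

```python
# Minimum RESET width expressed in CLK2 periods (from the Test Conditions column).
RESET_CLK2_PERIODS = 41


def min_reset_width_ns(clk2_period_ns):
    """Return the minimum RESET width for a given CLK2 period in ns."""
    return RESET_CLK2_PERIODS * clk2_period_ns


# Minimum CLK2 period (T1) for each speed grade, taken from the tables.
T1_MIN_NS = {16: 31.25, 20: 25.0, 25: 20.0}

for mhz, period_ns in T1_MIN_NS.items():
    width = min_reset_width_ns(period_ns)
    print(f"{mhz} MHz: T17 >= {width} ns")
```

For the 16 MHz part the product is 1281.25 ns, which the table lists as 1281 ns; the 20 MHz and 25 MHz entries (1025 ns and 820 ns) match exactly.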
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Figure 16. RESET Signal Timing 
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Figure 17. Hold Timing 


Design Considerations

Input hold times can be disregarded by the designer whenever the input
is removed because a subsequent output from the processor is deasserted
(e.g., DEN becomes deasserted).

Whenever the processor generates an output that indicates a transition
into a subsequent state, any outputs that are specified to be tri-stated
in this new state are guaranteed to be tri-stated. For example, in the
Td cycle following a Ta cycle for a read, the minimum output delay of
DEN is 2 ns, but the maximum float time of LAD is 20 ns. When DEN is
asserted, however, the LAD outputs are guaranteed to have been
tri-stated.


Designing for the ICE-960KB

The 80960KB In-Circuit Emulator assists in debugging both 80960KA and
80960KB hardware and software designs. The product consists of a probe
module, cable, and control unit. Because of the high operating frequency
of 80960KA systems, the probe module connects directly to the 80960KA
socket.

When designing an 80960KA hardware system that uses the ICE-960KB to
debug the system, several electrical and mechanical characteristics
should be considered. These considerations include capacitive loading,
drive requirement, power requirement and physical layout.

The ICE-960KB probe module increases the load capacitance of each line
by up to 25 pF. It also adds one standard Schottky TTL load on the CLK2
line, up to one advanced low-power Schottky TTL load for each control
signal line, and one advanced low-power Schottky TTL load for each
address/data and byte enable line. These loads originate from the probe
module and are driven by the 80960KA processor.

To achieve high noise immunity, the ICE-960KB is powered by the user’s
system. The high-speed probe circuitry draws up to 1.1A plus the maximum
current (ICC) of the 80960KA processor.

The mechanical considerations are shown in Figure 18, which illustrates
the lateral clearance requirements for the ICE-960KB probe as viewed
from above the socket of the 80960KA processor.
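
The loading and power figures quoted for the ICE-960KB probe can be turned into a quick budget check. The Python sketch below is illustrative only: it assumes that the AC test-load capacitance (for example, 60 pF on the LAD lines at 20 MHz) is treated as the total load budget for a line, and it uses 420 mA as an example processor ICC. All names are hypothetical.

```python
# Figures quoted in the text for the ICE-960KB probe module.
PROBE_CAPACITANCE_PF = 25.0   # added load capacitance per line (worst case)
PROBE_CURRENT_A = 1.1         # current drawn by the probe circuitry (worst case)


def remaining_load_budget_pf(spec_load_pf):
    """Capacitance left for the user's board once the probe is attached.

    Assumes the AC test-load capacitance is used as the total budget.
    """
    return spec_load_pf - PROBE_CAPACITANCE_PF


def supply_current_a(processor_icc_max_a):
    """Current the user's system must supply to the probe and processor."""
    return PROBE_CURRENT_A + processor_icc_max_a


print(remaining_load_budget_pf(60.0))      # LAD lines, 20 MHz test load
print(round(supply_current_a(0.420), 2))   # example ICC of 420 mA
```

A negative budget would indicate that the output delays in the AC characteristics tables can no longer be assumed once the probe is installed.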
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Figure 18. iCE-960KB Lateral Clearance Requirements 


MECHANICAL DATA 


Package Dimensions and Mounting 

The 80960KA is available in two different packages:
a 132-lead ceramic pin-grid array (PGA) and a 132-
lead plastic quad flat pack (PQFP). Pins in the ce-
ramic package are arranged 0.100 inch (2.54 mm)
center-to-center, in a 14 by 14 matrix, three rows
around. (See Figure 19.) The plastic package uses
fine-pitch gull wing leads arranged in a single row
along the perimeter of the package with 0.025 inch
(0.64 mm) spacing. (See Figure 20.) Dimensions are
given in Figure 21 and Table 7.

There are a wide variety of sockets available for the 
ceramic PGA package including low-insertion or 
zero-insertion force mountings, and a choice of ter- 
minals such as soldertail, surface mount, or wire 
wrap. Several applicable sockets are shown in Fig- 
ure 22. 

The PQFP is normally surface mounted to take best
advantage of the plastic package’s small footprint
and low cost. In some applications, however, de-
signers may prefer to use a socket, either to improve
heat dissipation or reduce repair costs. Figures 23a
and 23b show two of the many sockets available.


Pin Assignment 

The PGA and PQFP have different pin assignments.
Figure 24 shows the view from the bottom of the
PGA (pins facing up) and Figure 25 shows a view
from the top of the PGA (pins facing down). Figures
20 and 32 show the top view of the PQFP; notice
that the pins are numbered in order from 1 to 132
around the package’s perimeter. Tables 5 and 6 list
the function of each pin in the PGA, and Tables 8
and 9 list the function of each pin in the PQFP.

VCC and GND connections must be made to multi-
ple VCC and GND pins. Each VCC and GND pin must
be connected to the appropriate voltage or ground
and externally strapped close to the package. We
recommend that you include separate power and
ground planes in your circuit board for power distri-
bution.

NOTE:

Pins identified as N.C., “No Connect,” should never
be connected.
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Package Thermal Specification 

The 80960KA is specified for operation when case temperature is within
the range 0°C to +85°C (PGA) or +100°C (PQFP). The case temperature
should be measured at the top center of the package as shown in
Figure 26.

The ambient temperature can be calculated from θJC and θJA by using the
following equations:

TJ = TC + P × θJC

TA = TJ - P × θJA

TC = TA + P × [θJA - θJC]

Values for θJA and θJC are given in Table 10 for the PGA package and in
Table 11 for the PQFP for various airflows. Note that the θJA for the
PGA package can be reduced by adding a heatsink, while a heatsink is not
generally used with the plastic package since it is intended to be
surface mounted. The maximum allowable ambient temperature (TA)
permitted without exceeding TC is shown by the charts in Figures 27
through 30 for 10 MHz, 16 MHz, 20 MHz, and 25 MHz, respectively.

The curves assume the maximum permitted supply current (ICC) at each
speed, VCC of 5.0V, and a TCASE of +85°C (PGA) or +100°C (PQFP).

If you will be using the 80960KA in a harsh environment where the
ambient temperature may exceed the limits for the normal commercial
part, you should consider using an extended temperature part. These
parts are designated by the prefix “TA” and are available at 16 MHz,
20 MHz and 25 MHz in the ceramic PGA package. The extended operating
temperature range is -40°C to +125°C case. Figure 31 shows the maximum
allowable ambient temperature for the 20 MHz extended temperature
TA80960KA at various airflows. The curve assumes an ICC of 420 mA, VCC
of 5.0V, and a TCASE of +125°C.
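
As a worked example of the third equation, the Python sketch below estimates the maximum ambient temperature for the extended temperature part using the stated conditions (ICC of 420 mA, VCC of 5.0V, case limit of +125°C). It assumes the case-to-ambient values listed in Table 10 for the PGA with an omnidirectional heatsink, and that θJA - θJC equals the case-to-ambient resistance (note 2 of Table 10). The function name is illustrative, and the results are estimates rather than a substitute for the published curves.

```python
def max_ambient_temp_c(t_case_max_c, icc_a, vcc_v, theta_ca_c_per_w):
    """Rearrange TC = TA + P * (theta_JA - theta_JC) to solve for TA.

    theta_JA - theta_JC is the case-to-ambient thermal resistance.
    """
    power_w = icc_a * vcc_v
    return t_case_max_c - power_w * theta_ca_c_per_w


# Case-to-ambient resistance (deg C/Watt) for the PGA with an omnidirectional
# heatsink, keyed by airflow in ft./min (values as listed in Table 10).
THETA_CA_OMNI = {0: 16, 50: 15, 100: 14, 200: 12, 400: 9, 600: 7, 800: 6}

for airflow, theta_ca in THETA_CA_OMNI.items():
    ta = max_ambient_temp_c(125.0, 0.420, 5.0, theta_ca)
    print(f"{airflow:4d} ft./min: TA(max) = {ta:.1f} C")
```

With no airflow the dissipated power of 2.1 W across 16°C/Watt gives a 33.6°C rise, so the ambient limit is about 91.4°C.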


WAVEFORMS 

Figures 33 through 38 show the waveforms for vari- 
ous transactions on the 80960KA’s local bus. 


SUPPORT COMPONENTS 


85C960 Burst Bus Controller 

The Intel 85C960 performs burst logic, ready generation, and address
decode for the 80960KA and 80960KB. The burst logic supports both
standard and burst mode memories and peripherals. The ready generation
and timing control supports 0 to 15 wait states across eight address
ranges for read/write and burst accesses. The address decoder decodes
eight address inputs into four external and four internal chip selects.
The wait state and chip select values may be programmed by the user; the
timing control and burst logic are fixed.

The 85C960 operates with the 80960KA and 80960KB at all frequencies and
consumes only 50 mA at 25 MHz. The 85C960 is housed in 28-pin, 300-mil
ceramic DIP and plastic DIP packages or a 28-pin PLCC package for
surface mount. In the ceramic DIP package the part is UV-erasable, which
makes it easy to revise designs. Order the 85C960 data sheet
(No. 290192) for full details.


27960KX Burst Mode EPROM 

The Intel 27960KX one-megabit EPROM is designed specifically to support
the 80960KA and 80960KB. It uses a burst interface to offer near zero
wait-state performance without the high cost of alternative memory
technologies. The 27960KX removes the need for “dumping” code and data
stored in slow EPROMs or ROMs into expensive high-speed “shadow” RAM.

Internally, the 27960KX is organized in blocks of four bytes that are
accessed sequentially. The address of the four-byte block is latched and
incremented internally. After a set number of wait-states (1 or 2), data
is output one word at a time each subsequent clock cycle.
High-performance outputs provide zero wait-state data-to-data burst
accesses. Extra power and ground pins dedicated to the output reduce the
effect of fast output switching on the device. The 27960KX offers
1-0-0-0 performance at 20 MHz and 2-0-0-0 performance at 25 MHz. Full
details can be found in the 27960KX data sheet (No. 290237).
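
The wait-state patterns quoted for the 27960KX (1-0-0-0 and 2-0-0-0) can be compared with a short calculation. The Python sketch below is a rough model only: it assumes each word of a burst takes one clock plus its wait states, that the quoted frequencies are processor clock rates, and it ignores the address cycle. The function names are hypothetical.

```python
def parse_wait_states(pattern):
    """Turn a pattern such as "1-0-0-0" into a list of per-word wait states."""
    return [int(field) for field in pattern.split("-")]


def burst_data_clocks(pattern):
    """Clocks spent transferring the burst: one per word plus its wait states."""
    return sum(1 + waits for waits in parse_wait_states(pattern))


def burst_data_time_ns(pattern, clock_mhz):
    """Duration of the data phase of the burst at the given clock rate."""
    return burst_data_clocks(pattern) * 1000.0 / clock_mhz


print(burst_data_time_ns("1-0-0-0", 20))  # four words at 20 MHz
print(burst_data_time_ns("2-0-0-0", 25))  # four words at 25 MHz
```

Under these assumptions the extra wait state at 25 MHz is offset by the shorter clock period, so a four-word burst takes about the same time at either speed.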
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Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the 80960KA 
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Figure 20. The 132-Lead Plastic Quad Flat Pack (PQFP) used to Package the 80960KA 








Figure 21b. Details of the Molding of the 132-Lead PQFP 
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N 

Leadcount 

A 

Package Height 

A1 

Standoff 

D,E 

Terminal Dimension 

D1.E1 

Package Body 

D2.E2 

Bumper Distance 

Without Flash 

With Flash 

D3,E3 

Lead Dimension 

D4,E4 

Foot Radius Location 

LI 

Foot Length 


27.860 

27.860 


0.800 REF 


20.32 REF 


25.890 

0.510 
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• Low insertion force (LIF) soldertail 
55274-1 

• Amp tests indicate 50% reduction in 
insertion force compared to 
machined sockets 

Other socket options 

• Zero insertion force (ZIF) soldertail 
55583-1 

• Zero insertion force (ZIF) Burn-in 
version 55573-2 

Amp Incorporated
(Harrisburg, PA 17105 U.S.A.

Phone 717-564-0100)


55274-1




Cam handle locks in low profile position when 80960KA is installed 
(handle UP for open and DOWN for closed positions). 

Courtesy Amp Incorporated 


Peel-A-Way* Mylar and Kapton 
Socket Terminal Carriers 

• Low insertion force surface 
mount CS132-37TG 

• Low insertion force soldertail
CS132-01TG

• Low insertion force wire-wrap
CS132-02TG (two-level)
CS132-03TG (three-level)

• Low insertion force press-fit 
CS132-05TG 

Advanced Interconnections

(5 Division Street

Warwick, RI 02818 U.S.A.
Phone 401-885-0485)


Peel-A-Way Carrier No. 132:
Kapton Carrier is KS132
Mylar Carrier is MS132

Molded Plastic Body KS132 
is shown below; 



270775-26 



270775-27 

Courtesy Advanced Interconnections 
(Peel-A-Way Terminal Carriers 
U.S. Patent No. 4442938) 


* Peel-A-Way is a trademark of Advanced Interconnections. 



Figure 22. Several Socket Options for Mounting the 80960KA 
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Figure 24. 80960KA PGA Pinout— View from Bottom (Pins Facing Up) 
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Figure 25. 80960KA PGA Pinout— View from Top (Pins Facing Down) 
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Table 5. 80960KA PGA Pinout — In Pin Order 


Pin 

Signal 

Pin 

Signal 

Pin 

Signal 

Pin 

Signal 

A1 

Vcc 

C6 

LAD 20 

H1 

W/R 

M10 

Vss 

A2 

Vss 

C7 

LAD 13 

H2 

BEo 

loom 


A3 

LADi9 

C 8 

LADs 

H3 

LOCK 



A4 

LADi 7 

C9 

LAD 3 

H12 

N.C. 

M13 

N.C. 

A5 

LADi6 

C 10 

Vcc 

H13 

N.C. 

M14 

N.C. 

A 6 

LADi4 

C 11 

Vss 

H14 

N.C. 

N 1 

Vss 

A7 

LAD 11 

Cl 2 


J1 

DT/R 

N2 

N.C. 

A 8 

LADg 

C13 

INT 1 

J2 

BE 2 

N3 

N.C. 

A9 

LAD 7 

C14 

T^/MTo 

J3 

Vss 

N4 

N.C. 

A10 

LAD 5 

D 1 

Ml 

J12 

N.C. 

N5 

N.C. 

A11 

LAD 4 

D2 

ADS 

J13 

N.C. 

N 6 


A12 

LADi 

D3 


J14 

N.C. 

N7 


A13 

INT 2 /INTR 

D12 

Vcc 

K1 

BE 3 

N 8 


A14 

Vcc 

D13 

N.C. 

K2 

FAILURE 

N9 


B 1 

LAD 23 

D14 

N.C. 

K3 
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N10 


B2 
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E1 
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N11 


B3 
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N.C. 

N12 


B4 
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E3 

LAD 27 

K14 

N.C. 

N13 


B5 

LAD 18 

E12 

N.C. 

L1 

DEN 

N14 


B 6 

LADi 5 

E13 

Vss 

L2 

N.C. 

P 1 


B7 

LAD 12 

E14 

N.C. 

L3 

Vcc 

P2 


B 8 

LAD 1 O 

F1 

LAD 29 

L12 

Vss 

P3 


B9 

LADe 

F2 

LAD 31 

L13 

N.C. 

P4 


B10 

LAD 2 
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CACHE 

L14 

N.C. 

P5 


B11 

CLK2 

F12 

N.C. 

M1 

N.C. 

P 6 


B12 

LADo 

F13 

N.C. 

M2 

Vcc 

P7 


B13 

RESET 

F14 

N.C. 

M3 

Vss 

P 8 


B14 

Vss 

G 1 
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P9 
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HOLD/HLDAR 

G2 

READY 
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C2 
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P 11 
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N.C. 








Table 6. 80960KA PGA Pinout — In Signal Order


Signal | Pin | Signal | Pin | Signal | Pin | Signal | Pin
-------+-----+--------+-----+--------+-----+--------+----
ADS    | D2  | LAD15  | B6  | N.C.   | J14 | N.C.   | P9
ALE    | D1  | LAD16  | A5  | N.C.   | K13 | N.C.   | P10
BADAC  | C3  | LAD17  | A4  | N.C.   | K14 | N.C.   | P11
BE0    | H2  | LAD18  | B5  | N.C.   | L13 | N.C.   | P12
BE1    | G3  | LAD19  | A3  | N.C.   | L14 | N.C.   | L2
BE2    | J2  | LAD20  | C6  | N.C.   | M1  | READY  | G2
BE3    | K1  | LAD21  | B4  | N.C.   | M6  | RESET  | B13
CACHE  | F3  | LAD22  | B3  | N.C.   | M7  | Vcc    | A1
CLK2   | B11 | LAD23  | B1  | N.C.   | M8  | Vcc    | A14
DEN    | L1  | LAD24  | B2  | N.C.   | M9  | Vcc    | C4
DT/R   | J1  | LAD25  | C2  | N.C.   | M12 | Vcc    | C10
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K2 
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E2 
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Figure 29. 20 MHz 80960 K-Series Maximum Allowable Ambient Temperature 



Figure 30. Maximum Allowable Ambient Temperature for 
the 80960KA at 25 MHz (available in PGA only) 



Figure 31. Maximum Allowable Ambient Temperature for the Extended 
Temperature TA-80960KA at 20 MHz (available In PGA only) 
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Figure 32, 80960KA PQFP Pinout— View from Top 
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Table 8. 80960KA Plastic Package Pinout — In Pin Order 


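Pin | Signal     | Pin | Signal | Pin | Signal    | Pin | Signal
----+------------+-----+--------+-----+-----------+-----+-----------
1   | HLDA/HOLDR | 34  | N.C.   | 67  | Vss       | 100 | LAD0
2   | ALE        | 35  | Vcc    | 68  | Vss       | 101 | LAD1
3   | LAD26      | 36  | Vcc    | 69  | N.C.      | 102 | LAD2
4   | LAD27      | 37  | N.C.   | 70  | Vcc       | 103 | Vss
5   | LAD28      | 38  | N.C.   | 71  | Vcc       | 104 | LAD3
6   | LAD29      | 39  | N.C.   | 72  | N.C.      | 105 | LAD4
7   | LAD30      | 40  | N.C.   | 73  | Vss       | 106 | LAD5
8   | LAD31      | 41  | Vcc    | 74  | Vcc       | 107 | LAD6
9   | Vss        | 42  | Vss    | 75  | N.C.      | 108 | LAD7
10  | CACHE      | 43  | N.C.   | 76  | N.C.      | 109 | LAD8
11  | W/R        | 44  | N.C.   | 77  | N.C.      | 110 | LAD9
12  | READY      | 45  | N.C.   | 78  | N.C.      | 111 | LAD10
13  | DT/R       | 46  | N.C.   | 79  | Vss       | 112 | LAD11
14  | BE0        | 47  | N.C.   | 80  | Vss       | 113 | LAD12
15  | BE1        | 48  | N.C.   | 81  | N.C.      | 114 | Vss
16  | BE2        | 49  | N.C.   | 82  | Vcc       | 115 | LAD13
17  | BE3        | 50  | N.C.   | 83  | Vcc       | 116 | LAD14
18  | FAILURE    | 51  | N.C.   | 84  | Vss       | 117 | LAD15
19  | Vss        | 52  | Vss    | 85  | IAC/INT0  | 118 | LAD16
20  | LOCK       | 53  | Vss    | 86  | INT1      | 119 | LAD17
21  | DEN        | 54  | N.C.   | 87  | INT2/INTR | 120 | LAD18
22  | Vss        | 55  | Vcc    | 88  | INT3/INTA | 121 | LAD19
23  | Vss        | 56  | Vcc    | 89  | N.C.      | 122 | LAD20
24  | N.C.       | 57  | Vss    | 90  | Vss       | 123 | LAD21
25  | N.C.       | 58  | N.C.   | 91  | CLK2      | 124 | LAD22
26  | Vss        | 59  | N.C.   | 92  | Vcc       | 125 | Vss
27  | Vss        | 60  | N.C.   | 93  | RESET     | 126 | LAD23
28  | N.C.       | 61  | N.C.   | 94  | N.C.      | 127 | LAD24
29  | Vcc        | 62  | N.C.   | 95  | N.C.      | 128 | LAD25
30  | Vcc        | 63  | N.C.   | 96  | N.C.      | 129 | BADAC
31  | N.C.       | 64  | N.C.   | 97  | N.C.      | 130 | HOLD/HLDAR
32  | Vss        | 65  | N.C.   | 98  | N.C.      | 131 | N.C.
33  | Vss        | 66  | N.C.   | 99  | Vss       | 132 | ADS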
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Table 9. 80960KA Plastic Package Pinout^ln Signal Order 


Signal     | Pin | Signal | Pin | Signal | Pin | Signal | Pin
-----------+-----+--------+-----+--------+-----+--------+----
ADS        | 132 | LAD22  | 124 | N.C.   | 49  | Vcc    | 41
ALE        | 2   | LAD23  | 126 | N.C.   | 50  | Vcc    | 55
BADAC      | 129 | LAD24  | 127 | N.C.   | 51  | Vcc    | 56
BE0        | 14  | LAD25  | 128 | N.C.   | 54  | Vcc    | 70
BE1        | 15  | LAD26  | 3   | N.C.   | 58  | Vcc    | 71
BE2        | 16  | LAD27  | 4   | N.C.   | 59  | Vcc    | 74
BE3        | 17  | LAD28  | 5   | N.C.   | 60  | Vcc    | 82
CACHE      | 10  | LAD29  | 6   | N.C.   | 61  | Vcc    | 83
CLK2       | 91  | LAD3   | 104 | N.C.   | 62  | Vcc    | 92
DEN        | 21  | LAD30  | 7   | N.C.   | 63  | Vss    | 9
DT/R       | 13  | LAD31  | 8   | N.C.   | 64  | Vss    | 19
FAILURE    | 18  | LAD4   | 105 | N.C.   | 65  | Vss    | 22
HLDA/HOLDR | 1   | LAD5   | 106 | N.C.   | 66  | Vss    | 23
HOLD/HLDAR | 130 | LAD6   | 107 | N.C.   | 69  | Vss    | 26
IAC/INT0   | 85  | LAD7   | 108 | N.C.   | 72  | Vss    | 27
INT1       | 86  | LAD8   | 109 | N.C.   | 75  | Vss    | 32
INT2/INTR  | 87  | LAD9   | 110 | N.C.   | 76  | Vss    | 33
INT3/INTA  | 88  | LOCK   | 20  | N.C.   | 77  | Vss    | 42
LAD0       | 100 | N.C.   | 24  | N.C.   | 78  | Vss    | 52
LAD1       | 101 | N.C.   | 25  | N.C.   | 81  | Vss    | 53
LAD10      | 111 | N.C.   | 28  | N.C.   | 89  | Vss    | 57
LAD11      | 112 | N.C.   | 31  | N.C.   | 94  | Vss    | 67
LAD12      | 113 | N.C.   | 34  | N.C.   | 95  | Vss    | 68
LAD13      | 115 | N.C.   | 37  | N.C.   | 96  | Vss    | 73
LAD14      | 116 | N.C.   | 38  | N.C.   | 97  | Vss    | 79
LAD15      | 117 | N.C.   | 39  | N.C.   | 98  | Vss    | 80
LAD16      | 118 | N.C.   | 40  | N.C.   | 131 | Vss    | 84
LAD17      | 119 | N.C.   | 43  | READY  | 12  | Vss    | 90
LAD18      | 120 | N.C.   | 44  | RESET  | 93  | Vss    | 99
LAD19      | 121 | N.C.   | 45  | Vcc    | 29  | Vss    | 103
LAD2       | 102 | N.C.   | 46  | Vcc    | 30  | Vss    | 114
LAD20      | 122 | N.C.   | 47  | Vcc    | 35  | Vss    | 125
LAD21      | 123 | N.C.   | 48  | Vcc    | 36  | W/R    | 11
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Table 10. 80960KA PGA Package Thermal Characteristics 


Thermal Resistance (°C/Watt)

                                                         | Airflow, ft./min (m/sec)
Parameter                                                | 0   | 50     | 100    | 200    | 400    | 600    | 800
                                                         | (0) | (0.25) | (0.50) | (1.01) | (2.03) | (3.04) | (4.06)
---------------------------------------------------------+-----+--------+--------+--------+--------+--------+-------
θ Junction-to-Case (case measured as shown in Figure 26) | 2   | 2      | 2      | 2      | 2      | 2      | 2
θ Case-to-Ambient (No Heatsink)                          |     |        |        |        |        |        |
θ Case-to-Ambient (with Omnidirectional Heatsink)        | 16  | 15     | 14     | 12     | 9      | 7      | 6
θ Case-to-Ambient (with Unidirectional Heatsink)         | 15  | 14     | 13     | 11     | 8      | 6      | 5

NOTES:

1. This table applies to 80960KA PGA plugged into socket or soldered
directly into board.

2. θJA = θJC + θCA.

3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)

Table 11. 80960KA PQFP Package Thermal Characteristics 


PQFP Thermal Resistance (°C/Watt)

Airflow, ft./min (m/sec): 0 (0), 50 (0.25), 100 (0.50), 200 (1.01), 400 (2.03), 600 (3.04), 800 (4.06)

θ Junction-to-Case (case measured as shown in Figure 26): 9, 9, 9, 9, 9, 9, 9

θ Case-to-Ambient (No Heatsink): 22, 19, 18, 16, 11

NOTES:

1. This table applies to 80960KA PQFP soldered directly into board.

2. θJA = θJC + θCA.

3. θJL = 18°C/Watt
   θJB = 18°C/Watt
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Figure 34. Write Transaction with One Wait State 
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Figure 36. Burst Write Transaction with One Wait State 
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NOTE: 

INTR can go low no sooner than 5 ns (input hold time) following the beginning of interrupt acknowledgement cycle 1.
For a second interrupt to be acknowledged, INTR must be low for at least three cycles before it can be reasserted.

Figure 37. Interrupt Acknowledge Transaction 
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Figure 38. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 
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80960KB 

EMBEDDED 32-BIT PROCESSOR 
WITH INTEGRATED FLOATING-POINT UNIT 


■ High-Performance Embedded 
Architecture 

— 25 MIPS Burst Execution at 25 MHz 

— 9.4 MIPS* Sustained Execution at 
25 MHz 

■ On-Chip Floating-Point Unit 

— Supports IEEE 754 Standard 
— Four 80-Bit Registers 

— 5.2 Million Whetstones/s at 
25 MHz 

■ 512-Byte On-Chip Instruction Cache 
— Direct Mapped 

— Parallel Load/Decode for Uncached 
Instructions 

■ 4 Gigabyte, Linear Address Space 

■ 132-Lead PGA and PQFP Packages 


■ Multiple Register Sets 

— Sixteen Global 32-Bit Registers 
— Sixteen Local 32-Bit Registers 
— Four Local Register Sets Stored 
On-Chip 

— Register Scoreboarding 

■ Built-In Interrupt Controller 

— 32 Priority Levels, 256 Vectors

— 3.4 µs Latency

■ Easy to Use, High Bandwidth 32-Bit Bus 

— 66.7 Mbytes/s Burst 

— Up to 16-Bytes Transferred per Burst 

■ Uses 85C960 Bus Controller 

■ Supported by 27960KX Burst EPROMs 


The 80960KB is the first member of Intel’s new 32-bit processor family, the i960 series, which is designed
especially for embedded applications. It is based on the family’s high performance, common core architecture,
and includes a 512-byte instruction cache, a built-in interrupt controller, and an integrated floating-point unit.
The 80960KB has a large register set, multiple parallel execution units and a high-bandwidth, burst bus. Using
advanced RISC technology, this high performance processor is capable of execution rates in excess of 9.4
million instructions per second.* The 80960KB is well-suited for a wide range of embedded applications,
including laser printers, image processing, industrial control, robotics and telecommunications.


*Relative to Digital Equipment Corporation’s VAX-11/780** at 1 MIPS



270565-1 


Figure 1. The 80960KB’s Highly Parallel Microarchitecture 


** VAX-11™ is a trademark of Digital Equipment Corporation.
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THE 960 SERIES 

The 80960KB is the first member of a new family of
32-bit microprocessors from Intel known as the 960
Series. This series was especially designed to serve
the needs of embedded applications. The embedded
market includes applications as diverse as industrial
automation, avionics, image processing, graphics,
robotics, telecommunications and automobiles.
These types of applications require high
integration, low power consumption, quick interrupt
response times and high performance. Since time to
market is critical, embedded microprocessors need
to be easy to use in both hardware and software
designs.


All members of the 80960 series share a common
core architecture which utilizes RISC technology so
that, except for special functions, the family members
are object code compatible. Each new processor
in the series will add its own special set of functions
to the core to satisfy the needs of a specific
application or range of applications in the embedded
market. For example, future processors may include
a DMA controller, a timer or an A/D converter.

The 80960KB includes an integrated floating-point
unit. Intel also offers a pin-compatible version, called
the 80960KA, without an FPU, and a military-grade
version, the 80960MC, with support for memory
management, multitasking, multiprocessing and fault
tolerance.


[Figure 2 diagram: sixteen 32-bit global registers, g0 through g15 (1);
four 80-bit floating-point registers, fp0 through fp3; local registers
(2); arithmetic controls; instruction pointer; process controls; trace
controls; address space from 0 to 2^32 - 1]


NOTES:

1. Register g15 is reserved for stack management functions.

2. Registers r0, r1, and r2 are reserved for stack management functions.


Figure 2. Register Set
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KEY PERFORMANCE FEATURES 

The 80960KB’s architecture is based on the most 
recent advances in RISC technology and is ground- 
ed in Intel’s long experience in designing embedded 
controllers. Many features contribute to the 
80960KB’s exceptional performance: 

1. Large Register Set. Having a large number of 
registers reduces the number of times that a proces- 
sor needs to access memory. Modern compilers can 
take advantage of this feature to optimize execution 
speed. For maximum flexibility, the 80960KB pro- 
vides 32 32-bit registers and four 80-bit floating- 
point registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions make up the bulk of
instructions in most programs, so that execution speed can be greatly
improved by ensuring that these core instructions execute in as short a
time as possible. The most-frequently executed instructions such as
register-register moves, add/subtract, logical operations, and shifts
execute in one to two cycles. (Table 1 contains a list of instructions.)

3. Load/Store Architecture. Like other processors based on RISC
technology, the 80960KB has a Load/Store architecture: only the LOAD and
STORE instructions reference memory; all other instructions operate on
registers. This type of architecture simplifies instruction decoding and
is used in combination with other techniques to increase parallelism.


[Figure 3 diagram: fields of the five instruction formats]

Control:              Opcode | Displacement
Compare and Branch:   Opcode | Reg/Lit | Reg | M | Displacement
Register to Register: Opcode | Reg | Reg/Lit | Modes | Ext'd Op | Reg/Lit
Memory Access-Short:  Opcode | Reg | Base | M | X | Offset
Memory Access-Long:   Opcode | Reg | Base | Mode | Scale | xx | Index
                      Displacement

Figure 3. Instruction Formats
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Table 1. 80960KB Instruction Set 


Data Movement: Load, Store, Move, Load Address

Arithmetic: Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift,
Extended Multiply, Extended Divide

Logical: And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor,
Exclusive Nor, Not, Nand, Rotate

Bit and Bit Field: Set Bit, Clear Bit, Not Bit, Check Bit, Alter Bit,
Scan for Bit, Scan over Bit, Extract, Modify

Comparison: Compare, Conditional Compare, Compare and Increment,
Compare and Decrement

Branch: Unconditional Branch, Conditional Branch, Compare and Branch

Call/Return: Call, Call Extended, Call System, Return, Branch and Link

Fault: Conditional Fault, Synchronize Faults

Debug: Modify Trace Controls, Mark, Force Mark

Miscellaneous: Atomic Add, Atomic Modify, Flush Local Registers,
Modify Arithmetic Controls, Modify Process Controls,
Scan Byte for Equal, Test Condition Code

Decimal: Move, Add with Carry, Subtract with Carry

Conversion: Convert Real to Integer, Convert Integer to Real

Floating-Point: Move Real, Add, Subtract, Multiply, Divide, Remainder,
Scale, Round, Square Root, Sine, Cosine, Tangent, Arctangent, Log,
Log Binary, Log Natural, Exponent, Classify, Copy Real Extended, Compare

Synchronous: Synchronous Load, Synchronous Move
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4. Simple Instruction Formats. All instructions in 
the 80960KB are 32-bits long and must be aligned 
on word boundaries. This alignment makes it possi- 
ble to eliminate the instruction-alignment stage in 
the pipeline. To simplify the instruction decoder fur- 
ther, there are only five instruction formats and each 
instruction uses only one format. (See Figure 3.) 

5. Overlapped Instruction Execution. A load operation
allows execution of subsequent instructions to
continue before the data has been returned from
memory, so that these instructions can overlap the
load. The 80960KB manages this process transparently
to software through the use of a register scoreboard.
Conditional instructions also make use of a
scoreboard so that subsequent unrelated instructions
can be executed while the conditional instruction
is pending.

6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960KB gets
optimal use of its memory bus bandwidth because
the bus is tuned for use with the cache: the line size
of the instruction cache matches the maximum burst
size for instruction fetches. The 80960KB automatically
fetches four words in a burst and stores them
directly in the cache. Due to the size of the cache
and the fact that it is continually filled in anticipation
of needed instructions in the program flow, the
80960KB is exceptionally insensitive to memory wait
states. In fact, each wait state causes only a 7%
degradation in system performance. The benefit is
that the 80960KB will deliver outstanding performance
even with a low cost memory system.

8. Cache Bypass. If there is a cache miss, the processor
fetches the needed instruction, then sends it
on to the instruction decoder at the same time it
updates the cache. Thus, no extra time is taken to
load and read the cache.
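
The overlapped execution described in item 5 can be pictured with a toy scoreboard model. This is only an illustrative sketch: the three-cycle load latency, the one-cycle issue cost, the instruction tuples and the function name are assumptions made for the example, not 80960KB specifications.

```python
def run(program, load_latency=3):
    """Count cycles for a toy in-order machine with a register scoreboard.

    program: list of (op, dest, sources) tuples. Every instruction issues
    in one cycle (assumed). A LOAD marks its destination register busy for
    `load_latency` cycles after issue (assumed value).
    """
    ready_at = {}   # register -> cycle at which its pending load completes
    cycle = 0
    for op, dest, sources in program:
        # Stall until the scoreboard bit of every source register is clear.
        for reg in sources:
            cycle = max(cycle, ready_at.get(reg, 0))
        cycle += 1  # issue the instruction
        if op == "LOAD":
            ready_at[dest] = cycle + load_latency
    return cycle


# Unrelated work placed between the loads and their first use.
overlapped = [
    ("LOAD", "r4", []),
    ("LOAD", "r5", []),
    ("OTHER", "r8", ["r9"]),
    ("OTHER", "r10", ["r11"]),
    ("ADD", "r6", ["r4", "r5"]),
]

# Same instructions, but the ADD immediately follows the loads.
back_to_back = [
    ("LOAD", "r4", []),
    ("LOAD", "r5", []),
    ("ADD", "r6", ["r4", "r5"]),
    ("OTHER", "r8", ["r9"]),
    ("OTHER", "r10", ["r11"]),
]

print(run(overlapped), run(back_to_back))
```

Under these assumed latencies the overlapped ordering finishes sooner because the unrelated instructions fill cycles that would otherwise be spent waiting on the scoreboard.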


Memory Space and Addressing Modes 

The 80960KB offers a linear programming environment
so that all programs running on the processor
are contained in a single address space. The maximum
size of the address space is 4 Gigabytes (2^32
bytes).

For ease of use, the 80960KB has a small number of
addressing modes, but includes all those necessary
to ensure efficient compiler implementations of high-level
languages such as C, Fortran and Ada. Table 2
lists the memory addressing modes.


Data Types 

The 80960KB recognizes the following data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals

• 8-, 16-, 32- and 64-bit integers

• 32-, 64- and 80-bit real numbers

Non-Numeric:

• Bit

• Bit Field

• Triple-Word (96 bits)

• Quad-Word (128 bits)


Large Register Set 

The programming environment of the 80960KB includes
a large number of registers. In fact, 36 registers
are available at any time. The availability of this
many registers greatly reduces the number of memory
accesses required to execute most programs,
which leads to greater instruction processing speed.

There are two types of general-purpose registers:
local and global. The 20 global registers consist of
sixteen 32-bit registers (G0 through G15) and four
80-bit registers (FP0 through FP3). These registers
perform the same function as the general-purpose
registers provided in other popular microprocessors.
The term global refers to the fact that these registers
retain their contents across procedure calls.

The local registers, on the other hand, are procedure
specific. For each procedure call, the 80960KB
allocates 16 local registers (R0 through R15). Each
local register is 32 bits wide. Any register can also
be used for single or double-precision floating-point
operations; the 80-bit floating-point registers are provided
for extended precision.


Multiple Register Sets 

To further increase the efficiency of the register set,
multiple sets of local registers are stored on-chip.
This cache holds up to four local register frames,
which means that up to three procedure calls can be
made without having to access the procedure stack
resident in memory.

Although programs may have procedure calls nested
many calls deep, a program typically oscillates
back and forth between only two or three levels. As
a result, with four stack frames in the cache, the
probability of there being a free frame in the cache
when a call is made is very high. In fact, runs of
representative C-language programs show that 80%
of the calls are handled without needing to access
memory.


Table 2. Memory Addressing Modes

• 12-Bit Offset

• 32-Bit Offset

• Register-Indirect

• Register + 12-Bit Offset

• Register + 32-Bit Offset

• Register + (Index-Register x Scale-Factor)

• Register x Scale-Factor + 32-Bit Displacement

• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement

Scale-Factor is 1, 2, 4, 8 or 16
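
As a rough illustration of how the addressing modes in Table 2 combine their components, the sketch below computes an effective address from a base register, a scaled index register and a displacement. The function name, keyword arguments and the 32-bit wraparound are assumptions made for the example; the datasheet text here does not specify overflow behavior.

```python
MASK32 = 0xFFFFFFFF
VALID_SCALES = (1, 2, 4, 8, 16)   # scale factors listed under Table 2


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Return base + index * scale + displacement in a 32-bit address space.

    Leaving an argument at its default selects a simpler mode, e.g. only
    `displacement` for an offset mode or only `base` for register-indirect.
    Wrapping to 32 bits is an assumption of this sketch.
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    return (base + index * scale + displacement) & MASK32


# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
print(hex(effective_address(base=0x1000, index=3, scale=4, displacement=0x20)))
```

For example, indexing the fourth element of an array of 4-byte words that starts 0x20 bytes past the base register uses index 3 and scale 4.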

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.

Note that the global and floating-point registers are
not exchanged on a procedure call, but retain their
contents, making them available to all procedures
for fast parameter passing. An illustration of the register
cache is shown in Figure 4.
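
The behavior of the four-frame local register cache can be sketched as a small counter model. The class and attribute names are hypothetical, and reloading a frame from memory on return is an assumption of the sketch; the text above only describes the spill on call.

```python
class RegisterCache:
    """Toy model of an on-chip cache of local register frames."""

    def __init__(self, frames=4):
        self.frames = frames
        self.on_chip = 1      # the running procedure always holds one frame
        self.in_memory = 0    # frames spilled to the procedure stack
        self.spills = 0       # memory writes caused by calls
        self.fills = 0        # memory reads caused by returns (assumed)

    def call(self):
        if self.on_chip == self.frames:
            # Cache full: the oldest frame moves to the stack in memory.
            self.on_chip -= 1
            self.in_memory += 1
            self.spills += 1
        self.on_chip += 1

    def ret(self):
        self.on_chip -= 1
        if self.on_chip == 0:
            if self.in_memory == 0:
                raise RuntimeError("return from outermost procedure")
            # The caller's frame was spilled earlier; bring it back.
            self.in_memory -= 1
            self.on_chip = 1
            self.fills += 1


cache = RegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)   # three nested calls fit in the four frames
cache.call()
print(cache.spills)   # the fourth nested call forces one spill
```

A program that oscillates between two or three call levels never reaches the spill path in this model, which is the effect the text describes.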
Instruction Cache

To further reduce memory accesses, the 80960KB
includes a 512-byte on-chip instruction cache. The
instruction cache is based on the concept of locality
of reference; that is, most programs are not usually
executed in a steady stream but consist of many
branches and loops that lead to jumping back and
forth within the same small section of code. Thus, by
maintaining a block of instructions in a cache, the
number of memory references required to read instructions
into the processor can be greatly reduced.

To load the instruction cache, instructions are
fetched in 16-byte blocks, so that up to four instructions
can be fetched at one time. An efficient
prefetch algorithm increases the probability that an
instruction will already be in the cache when it is
needed.

Code for small loops will often fit entirely within the
cache, leading to a great increase in processing
speed since further memory references might not be
necessary until the program exits the loop. Similarly,
when calling short procedures, the code for the calling
procedure is likely to remain in the cache, so it
will be there on the procedure’s return.


Register Scoreboarding

The instruction decoder has been optimized in several
ways. One of these optimizations is the ability to
do instruction overlapping by means of register
scoreboarding.

Register scoreboarding occurs when a LOAD instruction
is executed to move a variable from memory
into a register. When the instruction is initiated, a
scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD R4, address 1
LOAD R5, address 2
Unrelated instruction
Unrelated instruction
ADD R4, R5, R6

In essence, the two unrelated instructions between
the LOAD and ADD instructions are executed for
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded.
Up to three LOAD instructions can be pending at
one time with three corresponding scoreboard bits
set. By exploiting this feature, system programmers
and compilers have a useful tool for optimizing execution
speed.


Floating-Point Arithmetic

In the 80960KB, floating-point arithmetic has been 
made an integral part of the architecture. Having the 
floating-point unit integrated on-chip provides two 
advantages. First, it improves the performance of 
the chip for floating-point applications, since no 
additional bus overhead is associated with floating- 
point calculations, thereby leaving more time for oth- 
er bus operations such as I/O. Second, the cost of 
using floating-point operations is reduced because a 
separate coprocessor chip is not required. 

The 80960KB floating-point (real number) data types 
include single-precision (32-bit), double-precision 
(64-bit), and extended precision (80-bit) floating- 
point numbers. Any register may be used to execute 
floating-point operations. 



The processor provides hardware support for both 
mandatory and recommended portions of IEEE 
Standard 754 for floating-point arithmetic, including 
all arithmetic, exponential, logarithmic, and other 
transcendental functions. Table 3 shows execution 
times for some representative instructions. 


Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz

Function        32-Bit    64-Bit

Add               0.4       0.5
Subtract          0.4       0.5
Multiply          0.7       1.3
Divide            1.3       2.9
Square Root       3.7       3.9
Arctangent       10.1      13.1
Exponent         11.3      12.5
Sine             15.2      16.6
Cosine           15.2      16.6
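
The times in Table 3 can be restated as approximate clock counts, since one cycle at 25 MHz lasts 0.04 µs. The sketch below does only that arithmetic; the dictionary and function names are illustrative, and because the table values are rounded to 0.1 µs the resulting cycle counts are approximate.

```python
CLOCK_MHZ = 25.0

# (32-bit, 64-bit) execution times in microseconds, copied from Table 3.
TIMES_US = {
    "add": (0.4, 0.5),
    "multiply": (0.7, 1.3),
    "divide": (1.3, 2.9),
    "sine": (15.2, 16.6),
}


def cycles(time_us, clock_mhz=CLOCK_MHZ):
    """Convert an execution time in microseconds to clock cycles."""
    return time_us * clock_mhz


for name, (t32, t64) in TIMES_US.items():
    print(f"{name:9s} {cycles(t32):7.1f} {cycles(t64):7.1f}")
```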
High Bandwidth Local Bus

An 80960KB CPU resides on a high-bandwidth address/data
bus known as the local bus (L-Bus). The
L-Bus provides a direct communication path between
the processor and the memory and I/O subsystem
interfaces. The processor uses the local bus
to fetch instructions, manipulate memory, and respond
to interrupts. Its features include:

• 32-bit multiplexed address/data path

• Four-word burst capability, which allows transfers
from 1 to 16 bytes at a time

• High bandwidth reads and writes at 66.7 Mbytes
per second

• Special signal to indicate whether a memory
transaction can be cached

Figure 5 identifies the groups of signals which constitute
the L-Bus. Table 4 lists the function of the L-Bus
and other processor-support signals, such as
the interrupt lines.
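
One way to arrive at a figure close to 66.7 Mbytes per second is to assume a zero-wait-state four-word burst at 25 MHz that occupies six bus cycles: one address cycle, four data cycles and one further cycle. That cycle budget is an assumption of this sketch, not something stated in the text above, and the function and parameter names are hypothetical.

```python
def burst_bandwidth_mbytes(clock_mhz=25.0, words=4, wait_states=0,
                           overhead_cycles=2):
    """Estimate burst throughput in Mbytes/s (1 Mbyte = 10**6 bytes).

    overhead_cycles: cycles per burst that carry no data (assumed: one
    address cycle plus one extra cycle).
    wait_states: extra cycles added to every data word (assumed model).
    """
    cycle_ns = 1000.0 / clock_mhz
    total_cycles = overhead_cycles + words * (1 + wait_states)
    bytes_moved = words * 4
    return bytes_moved / (total_cycles * cycle_ns) * 1000.0


print(round(burst_bandwidth_mbytes(), 1))
print(round(burst_bandwidth_mbytes(wait_states=1), 1))
```

The same function shows how quickly raw bus throughput drops once every data word carries a wait state, which is the situation the instruction cache is meant to hide.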


Interrupt Handling 

The 80960KB can be interrupted in one of two ways: 
by the activation of one of four interrupt pins or by 
sending a message on the processor’s data bus. 

The 80960KB is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide 8259A handshaking for expansion beyond four 
interrupt lines. 


Debug Features 

The 80960KB has built-in debug capabilities. There 
are two types of breakpoints and six different trace 
modes. The debug features are controlled by two 
internal 32-bit registers, the Process-Controls Word 
and the Trace-Controls Word. By setting bits in 
these control words, a software debug monitor can 
closely control how the processor responds during 
program execution. 

The 80960KB has both hardware and software 
breakpoints. It provides two hardware breakpoint 
registers on-chip which can be set by a special com- 
mand to any value. When the instruction pointer 
matches the value in one of the breakpoint registers, 
the breakpoint will fire, and a breakpoint handling 
routine is called automatically. 

The 80960KB also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 

Tracing is available for instructions (single-step execution),
calls and returns, and branching. Each different
type of trace may be enabled separately by a
special debug instruction. In each case, the
80960KB executes the instruction first and then
calls a trace handling routine (usually part of a software
debug monitor). Further program execution is
halted until the trace routine is completed. When the
trace event handling routine is completed, instruction
execution resumes at the next instruction. The
80960KB’s tracing mechanisms, which are implemented
completely in hardware, greatly simplify the
task of testing and debugging software.

Figure 5. Local Bus Signal Groups


FAULT DETECTION 

The 80960KB has an automatic mechanism to
handle faults. There are ten fault types including
trace, arithmetic, and floating-point faults. When the
processor detects a fault, it automatically calls the
appropriate fault handling routine and saves the current
instruction pointer and necessary state information
to make efficient recovery possible. The processor
posts diagnostic information on the type of fault
to a Fault Record. Like interrupt handling routines,
fault handling routines are usually written to meet
the needs of a specific application and are often included
as part of the operating system or kernel.

For each of the ten fault types, there are numerous
subtypes that provide specific information about a
fault. For example, a floating-point fault may have its
subtype set to an Overflow or Zero-Divide fault. The
fault handler can use this specific information to respond
correctly to the fault.

BUILT-IN TESTABILITY 

Upon reset, the 80960KB automatically conducts an
extensive internal test (self-test) of its major blocks
of logic. Then, before executing its first instruction, it
does a zero checksum on the first eight words in
memory to ensure that the system has been loaded
correctly. If a problem is discovered at any point during
the self-test, the 80960KB will assert its FAILURE
pin and will not begin program execution. The
self-test takes approximately 47,000 cycles to complete.

System manufacturers can use the 80960KB’s self-test
feature during incoming parts inspection. No
special diagnostic programs need to be written, and
the test is both thorough and fast. The self-test capability
helps ensure that defective parts will be discovered
before systems are shipped, and once in
the field, the self-test makes it easier to distinguish
between problems caused by processor failure and
problems resulting from other causes.
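
To put the 47,000-cycle figure in terms of wall-clock time, the sketch below converts it to milliseconds at each available frequency. It assumes the cycles quoted are internal processor clock cycles rather than CLK2 cycles; the names are illustrative.

```python
SELF_TEST_CYCLES = 47_000   # approximate figure quoted in the text


def self_test_ms(processor_mhz):
    """Approximate self-test duration in milliseconds.

    Assumes the cycle count refers to the internal processor clock.
    """
    return SELF_TEST_CYCLES / (processor_mhz * 1e6) * 1e3


for mhz in (10, 16, 20, 25):
    print(f"{mhz} MHz: {self_test_ms(mhz):.2f} ms")
```

Under that assumption the self-test completes in a few milliseconds even at the slowest speed grade.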


CHMOS 

The 80960KB is fabricated using Intel’s CHMOS IV
(Complementary High Speed Metal Oxide Semiconductor)
process. This advanced technology eliminates
the frequency and reliability limitations of older
CMOS processes and opens a new era in microprocessor
performance. It combines the high performance
capabilities of Intel’s industry-leading
HMOS technology with the high density and low
power characteristics of CMOS. The 80960KB is
available at 10, 16, 20 and 25 MHz.


Table 4a. 80960KB Pin Description: L-Bus Signals

Symbol        Type      Name and Function

CLK2          I         SYSTEM CLOCK provides the fundamental timing for 80960KB
                        systems. It is divided by two inside the 80960KB to
                        generate the internal processor clock.

LAD31-LAD0    I/O       LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses
              T.S.      and data to and from memory. During an address (Ta) cycle,
                        bits 2-31 contain a physical word address (bits 0-1
                        indicate SIZE; see below). During a data (Td) cycle, bits
                        0-31 contain read or write data. The LAD lines are active
                        HIGH and float to a high impedance state when not active.

                        SIZE, which comprises bits 0-1 of the LAD lines during a
                        Ta cycle, specifies the size of a burst transfer in words.

                        LAD1  LAD0
                         0     0     1 Word
                         0     1     2 Words
                         1     0     3 Words
                         1     1     4 Words

ALE           O         ADDRESS-LATCH ENABLE indicates the transfer of a physical
              T.S.      address. ALE is asserted during a Ta cycle and deasserted
                        before the beginning of the Td state. It is active LOW and
                        floats to a high impedance state during a hold cycle (Th
                        or Thr).

ADS           O         ADDRESS/DATA STATUS indicates an address state. ADS is
              O.D.      asserted every Ta state and deasserted during the
                        following Td state. For a burst transaction, ADS is
                        asserted again every Td state where READY was asserted in
                        the previous cycle.

W/R           O         WRITE/READ specifies, during a Ta cycle, whether the
              O.D.      operation is a write or read. It is latched on-chip and
                        remains valid during Td cycles.

DT/R          O         DATA TRANSMIT/RECEIVE indicates the direction of data
              O.D.      transfer to and from the L-Bus. It is low during Ta and Td
                        cycles for a read or interrupt acknowledgement; it is high
                        during Ta and Td cycles for a write. DT/R never changes
                        state when DEN is asserted (see Timing Diagrams).

DEN           O         DATA ENABLE is asserted during Td cycles and indicates
              O.D.      transfer of data on the LAD bus lines.

READY         I         READY indicates that data on LAD lines can be sampled or
                        removed. If READY is not asserted during a Td cycle, the
                        Td cycle is extended to the next cycle by inserting a wait
                        state (Tw), and ADS is not asserted in the next cycle.

LOCK          I/O       BUS LOCK prevents other bus masters from gaining control
              O.D.      of the L-Bus following the current cycle (if they would
                        assert LOCK to do so). LOCK is used by the processor or
                        any bus agent when it performs indivisible
                        Read/Modify/Write (RMW) operations. Do not leave LOCK
                        unconnected. It must be pulled high for the processor to
                        function properly.

                        For a read that is designated as an RMW-read, LOCK is
                        examined: if asserted, the processor waits until it is not
                        asserted; if not asserted, the processor asserts LOCK
                        during the Ta cycle and leaves it asserted.

                        A write that is designated as an RMW-write deasserts LOCK
                        in the Ta cycle. During the time LOCK is asserted, a bus
                        agent can perform a normal read or write but no RMW
                        operations. LOCK is also held asserted during an
                        interrupt-acknowledge transaction.

BE3-BE0       O         BYTE ENABLE LINES specify which data bytes (up to four) on
              O.D.      the bus take part in the current bus cycle. BE3
                        corresponds to LAD31-LAD24 and BE0 corresponds to
                        LAD7-LAD0.

                        The byte enables are provided in advance of data. The byte
                        enables asserted during Ta specify the bytes of the first
                        data word. The byte enables asserted during Td specify the
                        bytes of the next data word (if any), that is, the word to
                        be transmitted following the next assertion of READY. The
                        byte enables during the Td cycles preceding the last
                        assertion of READY are undefined. The byte enables are
                        latched on-chip and remain constant from one Td cycle to
                        the next when READY is not asserted.

                        For reads, the byte enables specify the byte(s) that the
                        processor will actually use. L-Bus agents are required to
                        assert only adjacent byte enables (e.g., asserting just
                        BE0 and BE2 is not permitted), and are required to assert
                        at least one byte enable. To produce address bits A0 and
                        A1 externally, they can be decoded from the byte enables.

HOLD/         I         HOLD: If the processor is the primary bus master (PBM),
HLDAR                   the input is interpreted as HOLD, a request from a
                        secondary bus master to acquire the bus. When the
                        processor receives HOLD and grants another master control
                        of the bus, it floats its tri-state bus lines and then
                        asserts HLDA and enters the Th state. When HOLD is
                        deasserted, the processor will deassert HLDA and go to
                        either the Ti or Ta state.

                        HOLD ACKNOWLEDGE RECEIVED: If the processor is a secondary
                        bus master (SBM), the input is HLDAR, which indicates,
                        when HOLDR output is high, that the processor has acquired
                        the bus. Processors and other agents can be told at reset
                        if they are the primary bus master (PBM).

HLDA/         O         HOLD ACKNOWLEDGE: If the processor is a primary bus
HOLDR         T.S.      master, the output is HLDA, which relinquishes control of
                        the bus to another bus master.

                        HOLD REQUEST: For secondary bus masters (SBM), the output
                        is HOLDR, which is a request to acquire the bus. The bus
                        is said to be acquired if the agent is a primary bus
                        master and does not have its HLDA output asserted, or if
                        the agent is a secondary bus master and has its HOLD input
                        and HLDA output asserted.

CACHE         O         CACHE indicates if an access is cacheable during a Ta
              T.S.      cycle. It is not asserted during any synchronous access,
                        such as a synchronous load or move instruction used for
                        sending an IAC message. The CACHE signal floats to a high
                        impedance state when the processor is idle.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state

Table 4b. 80960KB Pin Description: Module Support Signals

Symbol        Type      Name and Function

BADAC         I         BAD ACCESS, if asserted in the cycle following the one in
                        which the last READY of a transaction is asserted,
                        indicates that an unrecoverable error has occurred on the
                        current bus transaction, or that a synchronous load/store
                        instruction has not been acknowledged.

                        STARTUP: During system reset, the BADAC signal is
                        interpreted differently. If the signal is high, it
                        indicates that this processor will perform system
                        initialization. If it is low, another processor in the
                        system will perform system initialization instead.

RESET         I         RESET clears the internal logic of the processor and
                        causes it to re-initialize.

                        During RESET assertion, the input pins are ignored (except
                        for BADAC and IAC/INT0), the tri-state output pins are
                        placed in a high impedance state, and other output pins
                        are placed in their non-asserted state.

                        RESET must be asserted for at least 41 CLK2 cycles for a
                        predictable RESET. The HIGH to LOW transition of RESET
                        should occur after the rising edge of both CLK2 and the
                        external bus CLK, and before the next rising edge of CLK2.

FAILURE       O         INITIALIZATION FAILURE indicates that the processor has
              O.D.      failed to initialize correctly. After RESET is deasserted
                        and before the first bus transaction begins, FAILURE is
                        asserted while the processor performs a self-test. If the
                        self-test completes successfully, then FAILURE is
                        deasserted. Next, the processor performs a zero checksum
                        on the first eight words of memory. If it fails, FAILURE
                        is asserted for a second time and remains asserted; if it
                        passes, system initialization continues and FAILURE
                        remains deasserted.

N.C.          N/A       NOT CONNECTED indicates pins should not be connected.
                        Never connect any pin marked N.C.

IAC/          I         INTERAGENT COMMUNICATION REQUEST/INTERRUPT 0 indicates
INT0                    either that there is a pending IAC message for the
                        processor or an interrupt. The bus interrupt control
                        register determines in which way the signal should be
                        interpreted.

                        To signal an interrupt or IAC request in a synchronous
                        system, this pin (as well as the other interrupt pins)
                        must be enabled by being deasserted for at least one bus
                        cycle and then asserted for at least one additional bus
                        cycle; in an asynchronous system, the pin must remain
                        deasserted for at least two bus cycles and then be
                        asserted for at least two more bus cycles.

                        LOCAL PROCESSOR NUMBER: This signal is interpreted
                        differently during system reset. If the signal is at a
                        high voltage level, it indicates that this processor is a
                        primary bus master (Local Processor Number = 0); if it is
                        at a low voltage level, it indicates that this processor
                        is a secondary bus master (Local Processor Number = 1).

INT1          I         INTERRUPT 1, like INT0, provides direct interrupt
                        signaling.

INT2/         I         INTERRUPT 2/INTERRUPT REQUEST: The bus control register
INTR                    determines how this pin is interpreted. If INT2, it has
                        the same interpretation as the INT0 and INT1 pins. If
                        INTR, it is used to receive an interrupt request from an
                        external interrupt controller.

INT3/         I/O       INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt
INTA          O.D.      control register determines how this pin is interpreted.
                        If INT3, it has the same interpretation as the INT0, INT1,
                        and INT2 pins. If INTA, it is used as an output to control
                        interrupt-acknowledge bus transactions. The INTA output is
                        latched on-chip and remains valid during Td cycles; as an
                        output, it is open-drain.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state
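
The RESET requirement of at least 41 CLK2 cycles translates into a minimum pulse width that depends on the clock. The sketch below works this out from the processor frequency, using the fact stated for CLK2 that it runs at twice the internal processor clock; the function names are illustrative.

```python
RESET_MIN_CLK2_CYCLES = 41   # from the RESET pin description


def clk2_period_ns(processor_mhz):
    """CLK2 period in ns; CLK2 is divided by two to form the processor clock."""
    return 1000.0 / (2.0 * processor_mhz)


def min_reset_width_ns(processor_mhz):
    """Minimum RESET assertion time in ns for a predictable reset."""
    return RESET_MIN_CLK2_CYCLES * clk2_period_ns(processor_mhz)


print(clk2_period_ns(16), min_reset_width_ns(16))
```

At 16 MHz this gives a CLK2 period of 31.25 ns and a minimum width of about 1281 ns.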


ELECTRICAL SPECIFICATIONS 


Power and Grounding 

The 80960KB is implemented in CHMOS IV technology
and has modest power requirements. Its high
clock frequency and numerous output buffers (address/data,
control, error, and arbitration signals)
can cause power surges as multiple output buffers
drive new signal levels simultaneously. For clean on-chip
power distribution at high frequency, 12 Vcc
and 13 Vss pins separately feed functional units of
the 80960KB in the PGA.

Power and ground connections must be made to all
power and ground pins of the 80960KB. On the circuit
board, all Vcc pins must be strapped closely
together, preferably on a power plane. Likewise, all
Vss pins should be strapped together, preferably on
a ground plane. These pins may not be connected
together within the chip.


Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960KB. The processor can cause tran- 
sient power surges when driving the L-Bus, particu- 
larly when it is connected to a large capacitive load. 

Low inductance capacitors and interconnects are
recommended for best high frequency electrical performance.
Inductance can be reduced by shortening
the board traces between the processor and decoupling
capacitors as much as possible. Capacitors
specifically designed for PGA packages are also
commercially available and offer the lowest possible
inductance.


Connection Recommendations 

For reliable operation, always connect unused in- 
puts to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be pulled up. No inputs should ever be left floating. 
All open-drain outputs require a pullup device. While
in some cases a simple pullup resistor will be adequate,
we recommend a network of pullup and pulldown
resistors biased to a valid VIH (≥3.4V) and
terminated in the characteristic impedance of the circuit
board. Figure 6 shows our recommendations for
the resistor values for both a low and high current
drive network, which assumes that the circuit board
has a characteristic impedance of 100Ω. The advantage
of terminating the output signals in this fashion
is that it limits signal swing and reduces AC power
consumption.
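
For readers who want to see how a pullup/pulldown pair can meet both conditions at once, the sketch below solves the ideal two-resistor divider whose Thevenin resistance equals the board impedance and whose open-circuit voltage equals the bias level. These are textbook ideal values under the stated assumptions (5V supply, 3.4V bias, 100Ω board); they are not the resistor values recommended in Figure 6, and the function name is hypothetical.

```python
def split_termination(z0_ohms=100.0, vcc=5.0, v_bias=3.4):
    """Return (pullup, pulldown) resistances in ohms for an ideal divider.

    Conditions solved:
      pullup || pulldown          == z0_ohms   (matches the trace)
      vcc * pulldown / (up+down)  == v_bias    (idle level on the line)
    """
    if not 0.0 < v_bias < vcc:
        raise ValueError("bias voltage must lie between ground and Vcc")
    pullup = z0_ohms * vcc / v_bias
    pulldown = z0_ohms * vcc / (vcc - v_bias)
    return pullup, pulldown


r_up, r_down = split_termination()
print(round(r_up, 1), round(r_down, 1))
```

In practice the nearest standard resistor values would be chosen, and the driver's current capability limits how low the pair can go.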


Characteristic Curves 

Figure 7 shows the typical supply current requirements
over the operating temperature range of the
processor at supply voltage (Vcc) of 5V. Figure 8
shows the typical power supply current (ICC) required
by the 80960KB at various operating frequencies
when measured at three input voltage (Vcc)
levels.

For a given output current (IOL), the curve in Figure 9
shows the worst case output low voltage (VOL).

Figure 10 shows the typical capacitive derating
curve for the 80960KB measured from 1.5V on the
system clock (CLK) to 1.5V on the falling edge and
1.5V on the rising edge of the L-Bus address/data
(LAD) signals.


Test Load Circuit 

Figure 13 illustrates the load circuit used to test the
80960KB’s tristate pins, and Figure 14 shows the
load circuit used to test the open drain outputs. The
open drain test uses an active load circuit in the form
of a matched diode bridge. Since the open-drain outputs
sink current, only the IOL legs of the bridge are
necessary and the IOH legs are not used. When the
80960KB driver under test is turned off, the output
pin is pulled up to VREF (i.e., VOH). Diode D1 is
turned off and the IOL current source flows through
diode D2.

When the 80960KB open-drain driver under test is
on, diode D1 is also on, and the voltage on the pin
being tested drops to VOL. Diode D2 turns off and
IOL flows through diode D1.





Figure 6. Connection Recommendations for Low and High Current Drive Networks 
Figure 9. Worst Case Voltage vs Output Current on Open-Drain Pins


Figure 10. Capacitive Derating Curve 
ABSOLUTE MAXIMUM RATINGS*

Operating Temperature      0°C to +85°C Case
Storage Temperature        -65°C to +150°C
Voltage on Any Pin         -0.5V to Vcc + 0.5V
Power Dissipation          2.5W (25 MHz)


NOTICE: This is a production data sheet. The specifi- 
cations are subject to change without notice. 

* WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and ex- 
tended exposure beyond the “Operating Conditions” 
may affect device reliability. 


DC CHARACTERISTICS

PGA:

80960KB (16 MHz): TCASE = 0°C to +85°C, Vcc = 5V ± 10%
80960KB (20 and 25 MHz): TCASE = 0°C to +85°C, Vcc = 5V ± 5%

PQFP:

80960KA (10 and 16 MHz): TCASE = 0°C to +100°C, Vcc = 5V ± 10%
80960KA (20 MHz): TCASE = 0°C to +100°C, Vcc = 5V ± 5%

Symbol  Parameter                    Min         Max         Units  Test Conditions

VIL     Input Low Voltage            -0.3        +0.8        V
VIH     Input High Voltage           2.0         Vcc + 0.3   V
VCL     CLK2 Input Low Voltage       -0.3        +0.8        V
VCH     CLK2 Input High Voltage      0.55 Vcc    Vcc + 0.3   V
VOL     Output Low Voltage                       0.45        V      (1, 5)
VOH     Output High Voltage          2.4                     V      (2, 4)
ICC     Power Supply Current:
          10 MHz                                 300         mA
          16 MHz                                 375         mA
          20 MHz                                 420         mA
          25 MHz                                 480         mA
ILI     Input Leakage Current                    ±15         µA     0 ≤ VIN ≤ Vcc
ILO     Output Leakage Current                   ±15         µA     0.45 ≤ VO ≤ Vcc
CIN     Input Capacitance                        10          pF     fC = 1 MHz (3)
CO      I/O or Output Capacitance                12          pF     fC = 1 MHz (3)
CCLK    Clock Capacitance                        10          pF     fC = 1 MHz (3)

NOTES:

1. For tri-state outputs, this parameter is measured at:
     Address/Data   4.0 mA
     Controls       5.0 mA

2. This parameter is measured at:
     Address/Data   -1.0 mA
     Controls       -0.9 mA
     ALE            -5.0 mA

3. Input, output, and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. For open-drain outputs 25 mA
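
As a quick cross-check of the supply figures, the sketch below multiplies the maximum ICC values from the table by the highest supply voltage allowed by the stated tolerance. The function name and the idea of using the upper tolerance limit are choices made for this illustration, not a method given in the datasheet.

```python
# Maximum power supply current in mA per speed grade, from the DC table.
ICC_MAX_MA = {10: 300, 16: 375, 20: 420, 25: 480}


def worst_case_power_w(freq_mhz, vcc_nominal=5.0, tolerance=0.05):
    """Maximum ICC times the highest permitted Vcc, in watts."""
    vcc_max = vcc_nominal * (1.0 + tolerance)
    return ICC_MAX_MA[freq_mhz] / 1000.0 * vcc_max


for mhz in sorted(ICC_MAX_MA):
    print(f"{mhz} MHz: {worst_case_power_w(mhz):.2f} W")
```

For the 25 MHz grade at 5V + 5% this gives about 2.5 W, in line with the power dissipation listed under the absolute maximum ratings.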
AC SPECIFICATIONS

This section describes the AC specifications for the
80960KB pins. All input and output timings are specified
relative to the 1.5V level of the rising edge. For
output timings, the specifications refer to the time it
takes the signal to reach 1.5V.

For input timings, the specifications refer to the time
at which the signal reaches (for input setup) or
leaves (for hold time) the TTL levels of LOW (0.8V)
or HIGH (2.0V). All AC testing should be done with
input clock voltages of 0.4V and 2.4V, except for the
clock (CLK2), which should be tested with input voltages
of 0.45 Vcc and 0.55 Vcc.


Figure 11. Drive Levels and Timing Relationships for 80960KB Signals
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AC Specification Tables 

80960KB AC Characteristics (10 MHz, PQFP Only) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
------  --------------------------------  -----  ----  -----  ------------------------------------------
T1      Processor Clock Period (CLK2)     50     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   12           ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  12           ns     VIH = 90% Point = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      25    ns     CL = 100 pF (LAD); CL = 75 pF (Controls)(2)
T6H     HOLDA Output Valid Delay          4      31    ns     CL = 75 pF
T7      ALE Width                         25           ns     CL = 75 pF
T8      ALE Output Valid Delay            0      20    ns     CL = 75 pF(2)
T9      Output Float Delay                2      20    ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 75 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     8            ns
T13     Setup to ALE Inactive             10           ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1640         ns     41 CLK2 Periods Minimum


NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested. 
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80960KB AC Characteristics (16 MHz) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
------  --------------------------------  -----  ----  -----  ------------------------------------------
T1      Processor Clock Period (CLK2)     31.25  125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   8            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  8            ns     VIH = 90% Point = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      25    ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T6H     HOLDA Output Valid Delay          4      31    ns     CL = 75 pF
T7      ALE Width                         15           ns     CL = 75 pF
T8      ALE Output Valid Delay            0      20    ns     CL = 75 pF(2)
T9      Output Float Delay                2      20    ns     CL = 100 pF (LAD); CL = 75 pF (Controls)(2)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 75 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     8            ns
T13     Setup to ALE Inactive             10           ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 100 pF (LAD); CL = 75 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1281         ns     41 CLK2 Periods Minimum


NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested. 
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80960KB AC Characteristics (20 MHz) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
------  --------------------------------  -----  ----  -----  ------------------------------------------
T1      Processor Clock Period (CLK2)     25     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   6            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  6            ns     VIH = 90% Point = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2            ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay          4      26    ns     CL = 50 pF
T7      ALE Width                         12           ns     CL = 50 pF
T8      ALE Output Valid Delay            0      20    ns     CL = 50 pF(2)
T9      Output Float Delay                2      20    ns     CL = 60 pF (LAD); CL = 50 pF (Controls)(2)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 50 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     7            ns
T13     Setup to ALE Inactive             10           ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1025         ns     41 CLK2 Periods Minimum


NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested. 



Tri-State Output Pins 
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80960KB AC Characteristics (25 MHz, PGA Only) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
------  --------------------------------  -----  ----  -----  ------------------------------------------
T1      Processor Clock Period (CLK2)     20     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   5            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time         5            ns     VIH = 90% Point = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      18    ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay          4      24    ns     CL = 50 pF
T7      ALE Width                         12           ns     CL = 50 pF
T8      ALE Output Valid Delay            0      20    ns     CL = 50 pF(2)
T9      Output Float Delay                2      18    ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 50 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     7            ns
T13     Setup to ALE Inactive             8            ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       820          ns     41 CLK2 Periods Minimum


NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested. 
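
The clock rows of the AC tables can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that CLK2 runs at twice the processor frequency (which is consistent with the T1 minimum of each speed grade) and that the T17 minimum is simply 41 times the minimum CLK2 period. The function names are hypothetical. Note that the 10 MHz table lists 1640 ns for T17, which this simple product does not reproduce, so the tabulated value should be treated as authoritative.

```python
# Minimum CLK2 period (T1 min, ns) per speed grade, copied from the AC tables.
T1_MIN_NS = {10: 50.0, 16: 31.25, 20: 25.0, 25: 20.0}

# From the T17 test condition: "41 CLK2 Periods Minimum".
RESET_CLK2_PERIODS = 41


def processor_mhz(t1_ns: float) -> float:
    """Processor frequency in MHz, assuming CLK2 is twice the processor clock."""
    return 1000.0 / (2.0 * t1_ns)


def reset_width_ns(t1_ns: float) -> float:
    """Reset width implied by 41 CLK2 periods of length t1_ns."""
    return RESET_CLK2_PERIODS * t1_ns


if __name__ == "__main__":
    for grade, t1 in T1_MIN_NS.items():
        print(f"{grade} MHz grade: CLK2 period {t1} ns, "
              f"processor {processor_mhz(t1):.2f} MHz, "
              f"41 periods = {reset_width_ns(t1):.2f} ns")
```

For the 16, 20 and 25 MHz grades the computed products agree with the T17 entries (1281, 1025 and 820 ns) once the 16 MHz result is truncated to a whole nanosecond.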
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Figure 17. Hold Timing 


Design Considerations

Input hold times can be disregarded by the designer
whenever the input is removed because a subsequent
output from the processor is deasserted (e.g.,
DEN becomes deasserted).

Whenever the processor generates an output that
indicates a transition into a subsequent state, any
outputs that are specified to be tri-stated in this new
state are guaranteed to be tri-stated. For example, in
the Td cycle following a Ta cycle for a read, the minimum
output delay of DEN is 2 ns, but the maximum
float time of LAD is 20 ns. When DEN is asserted,
however, the LAD outputs are guaranteed to have
been tri-stated.

Designing for the ICE-960KB

The 80960KB In-Circuit Emulator assists in debugging
both 80960KA and 80960KB hardware and
software designs. The product consists of a probe
module, cable, and control unit. Because of the high
operating frequency of 80960KB systems, the probe
module connects directly to the 80960KB socket.

When designing an 80960KB hardware system that
uses the ICE-960KB to debug the system, several
electrical and mechanical characteristics should be
considered. These considerations include capacitive
loading, drive requirement, power requirement and
physical layout.

The ICE-960KB probe module increases the load
capacitance of each line by up to 25 pF. It also adds
one standard Schottky TTL load on the CLK2 line,
up to one advanced low-power Schottky TTL load
for each control signal line, and one advanced low-power
Schottky TTL load for each address/data and
byte enable line. These loads originate from the
probe module and are driven by the 80960KB processor.

To achieve high noise immunity, the ICE-960KB
probe is powered by the user's system. The high-speed
probe circuitry draws up to 1.1A plus the maximum
current (Icc) of the 80960KB processor.

The mechanical considerations are shown in Figure
18, which illustrates the lateral clearance requirements
for the ICE-960KB probe as viewed from
above the socket of the 80960KB processor.
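
The probe's electrical figures lend themselves to a quick budget check. The sketch below is a rough illustration, not a design rule: it assumes that the capacitive load named in the AC test conditions can be treated as a per-line budget from which the probe's "up to 25 pF" is subtracted, and it takes the supply requirement as the stated 1.1A plus the processor's maximum Icc. The helper names and the example inputs (a 60 pF test load and an Icc of 420 mA) are chosen only for illustration.

```python
PROBE_EXTRA_CAP_PF = 25.0  # probe adds up to 25 pF per line
PROBE_CURRENT_A = 1.1      # probe circuitry draws up to 1.1 A


def remaining_load_pf(spec_load_pf: float) -> float:
    """Capacitance left for the target board if the AC test load is
    treated as a budget (an assumption, not a data sheet statement)."""
    return spec_load_pf - PROBE_EXTRA_CAP_PF


def supply_current_a(icc_max_a: float) -> float:
    """Worst-case current the user's system must supply with the probe fitted."""
    return PROBE_CURRENT_A + icc_max_a


if __name__ == "__main__":
    print(f"Board budget on a 60 pF line: {remaining_load_pf(60.0):.0f} pF")
    print(f"Supply current with Icc = 0.42 A: {supply_current_a(0.42):.2f} A")
```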
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Figure 18. ICE-960KB Lateral Clearance Requirements 


MECHANICAL DATA 


Package Dimensions and Mounting 

The 80960KB is available in two different packages: 
a 132-lead ceramic pin-grid array (PGA) and a 132- 
lead plastic quad flat pack (PQFP). Pins in the ce- 
ramic package are arranged 0.100 inch (2.54 mm) 
center-to-center, in a 14 by 14 matrix, three rows 
around. (See Figure 19.) The plastic package uses 
fine-pitch gull wing leads arranged in a single row 
along the perimeter of the package with 0.025 inch 
(0.64 mm) spacing. (See Figure 20.) Dimensions are 
given in Figure 21 and Table 7. 

There are a wide variety of sockets available for the 
ceramic PGA package including low-insertion or 
zero-insertion force mountings, and a choice of ter- 
minals such as soldertail, surface mount, or wire 
wrap. Several applicable sockets are shown in Fig- 
ure 22. 

The PQFP is normally surface mounted to take best
advantage of the plastic package's small footprint
and low cost. In some applications, however, designers
may prefer to use a socket, either to improve
heat dissipation or reduce repair costs. Figures 23a
and 23b show two of the many sockets available.


Pin Assignment 

The PGA and PQFP have different pin assignments. 
Figure 24 shows the view from the bottom of the 
PGA (pins facing up) and Figure 25 shows a view 
from the top of the PGA (pins facing down). Figures 
20 and 32 show the top view of the PQFP; notice 
that the pins are numbered in order from 1 to 132 
around the package’s perimeter. Tables 5 and 6 list 
the function of each pin in the PGA, and Tables 8 
and 9 list the function of each pin in the PQFP. 

Vcc and GND connections must be made to multi- 
ple Vcc and GND pins. Each Vcc and GND pin must 
be connected to the appropriate voltage or ground 
and externally strapped close to the package. We 
recommend that you include separate power and 
ground planes in your circuit board for power distri- 
bution. 

NOTE: 

Pins identified as N.C., "No Connect," should never
be connected.
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Package Thermal Specification 

The 80960KB is specified for operation when case
temperature is within the range 0°C to +85°C (PGA)
or +100°C (PQFP). The case temperature should
be measured at the top center of the package as
shown in Figure 26.

The ambient temperature can be calculated from
θjc and θja by using the following equations:

Tj = Tc + P*θjc

Ta = Tj - P*θja

Tc = Ta + P*[θja - θjc]


Values for θja and θjc are given in Table 10 for the
PGA package and in Table 11 for the PQFP for various
airflows. Note that the θja for the PGA package
can be reduced by adding a heatsink, while a heatsink
is not generally used with the plastic package
since it is intended to be surface mounted. The maximum
allowable ambient temperature (Ta) permitted
without exceeding Tc is shown by the charts in Figures
27 through 30 for 10 MHz, 16 MHz, 20 MHz,
and 25 MHz respectively.
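
The thermal equations above can be exercised numerically. The following sketch is illustrative: it uses the relation θja = θjc + θca from the notes to Table 10, so that Ta = Tc - P*θca, and plugs in the PGA values listed for still air with no heatsink (θjc = 2°C/W, θca = 19°C/W). The power figure of 2.1 W is an assumption obtained from 5.0V times 420 mA, and the function names are hypothetical.

```python
def junction_temp_c(tc_c: float, power_w: float, theta_jc: float) -> float:
    """Tj = Tc + P * theta_jc."""
    return tc_c + power_w * theta_jc


def max_ambient_c(tc_max_c: float, power_w: float, theta_ca: float) -> float:
    """Ta = Tc - P * (theta_ja - theta_jc) = Tc - P * theta_ca."""
    return tc_max_c - power_w * theta_ca


if __name__ == "__main__":
    power_w = 5.0 * 0.42   # assumed: Vcc = 5.0 V, Icc = 420 mA
    tc_max = 85.0          # PGA case temperature limit, degrees C
    theta_jc = 2.0         # Table 10, junction-to-case
    theta_ca = 19.0        # Table 10, case-to-ambient, no heatsink, 0 ft/min

    print(f"Junction temperature at Tc max: "
          f"{junction_temp_c(tc_max, power_w, theta_jc):.1f} C")
    print(f"Maximum ambient temperature:    "
          f"{max_ambient_c(tc_max, power_w, theta_ca):.1f} C")
```

Substituting a smaller θca (more airflow or a heatsink) raises the permissible ambient temperature, which is the trend the charts in Figures 27 through 30 express.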


SUPPORT COMPONENTS 


85C960 Burst Bus Controller 

The Intel 85C960 performs burst logic, ready gener- 
ation, and address decode for the 80960KA and 
80960KB. The burst logic supports both standard 
and burst mode memories and peripherals. The 
ready generation and timing control supports 0 to 15 
wait states across eight address ranges for read/ 
write and burst accesses. The address decoder de- 
codes eight address inputs into four external and 
four internal chip selects. The wait state and chip 
select values may be programmed by the user; the 
timing control and burst logic are fixed. 


The 85C960 operates with the 80960KA and
80960KB at all frequencies and consumes only 50
mA at 25 MHz. The 85C960 is available in 28-pin,
300-mil ceramic and plastic DIP packages, or in a
28-pin PLCC package for surface mount. In the ceramic
DIP package the part is UV-erasable, which makes it
easy to revise designs. Order the 85C960 data sheet
(No. 290192) for full details.



The curves assume the maximum permitted supply
current (Icc) at each speed, Vcc of 5.0V, and a
Tcase of +85°C (PGA) or +100°C (PQFP).

If you will be using the 80960KB in a harsh environment
where the ambient temperature may exceed
the limits for the normal commercial part, you should
consider using an extended temperature part. These
parts are designated by the prefix "TA" and are available
at 16, 20 and 25 MHz in the ceramic PGA package.
The extended operating temperature range is
-40°C to +125°C case. Figure 31 shows the maximum
allowable ambient temperature for the 20 MHz
extended temperature TA80960KB at various airflows.
The curve assumes an Icc of 420 mA, Vcc of
5.0V, and a Tcase of +125°C.


WAVEFORMS 

Figures 33 through 38 show the waveforms for vari- 
ous transactions on the 80960KB’s local bus. 


27960KX Burst Mode EPROM 

Intel 27960KX one-megabit EPROM is designed 
specifically to support the 80960KA and 80960KB. It 
uses a burst interface to offer near zero wait-state 
performance without the high cost of alternative 
memory technologies. The 27960KX removes the 
need for “dumping” code and data stored in slow 
EPROMs or ROMs into expensive high-speed 
“shadow” RAM. 

Internally, the 27960KX is organized in blocks of four
bytes that are accessed sequentially. The address
of the four-byte block is latched and incremented
internally. After a set number of wait-states (1 or 2),
data is output one word at a time each subsequent
clock cycle. High-performance outputs provide zero
wait-state data-to-data burst accesses. Extra power
and ground pins dedicated to the output reduce the
effect of fast output switching on the device. The
27960KX offers 1-0-0-0 performance at 20 MHz and
2-0-0-0 performance at 25 MHz. Full details can be
found in the 27960KX data sheet (No. 290337).
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Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the 80960KB 
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Figure 2 Id. Board Footprint Area for the 132-Lead PQFP 
Table 7. Package Dimension: 80960KB PQFP 



Symbol  Description                       Inches              MM
                                          Min      Max        Min      Max
------  --------------------------------  -------  -------    -------  -------
N       Leadcount                         132 Leads           132 Leads
A       Package Height                    0.160    0.170      4.060    4.320
A1      Standoff                          0.020    0.030      0.510    0.760
D,E     Terminal Dimension                1.075    1.085      27.310   27.560
D1,E1   Package Body                      0.947    0.953      24.050   24.210
D2,E2   Bumper Distance, Without Flash    1.097    1.103      27.860   28.010
        Bumper Distance, With Flash       1.097    1.110      27.860   28.190
D3,E3   Lead Dimension                    0.800 REF           20.32 REF
D4,E4   Foot Radius Location              1.023    1.037      25.890   26.330
L1      Foot Length                       0.020    0.030      0.510    0.760
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• Low insertion force (LIF) soldertail 
55274-1 

• Amp tests indicate 50% reduction in 
insertion force compared to 
machined sockets 

Other socket options 

• Zero insertion force (ZIF) soldertail 
55583-1 

• Zero insertion force (ZIF) Burn-in 
version 55573-2 

Amp Incorporated
(Harrisburg, PA 17105 U.S.A.
Phone 717-564-0100)



Cam handle locks in low profile position when 80960KB is installed 
(handle UP for open and DOWN for closed positions). 

Courtesy Amp Incorporated 


Peel-A-Way* Mylar and Kapton
Socket Terminal Carriers

• Low insertion force surface
mount CS132-37TG

• Low insertion force soldertail
CS132-01TG


Advanced Interconnections
(5 Division Street,
Warwick, RI 02818 U.S.A.
Phone 401-885-0485)


Peel-A-Way Carrier No. 132:
Kapton Carrier is KS132
Mylar Carrier is MS132

Molded Plastic Body KS132
is shown below:


• Low insertion force wire-wrap
CS132-02TG (two-level)
CS132-03TG (three-level)

• Low insertion force press-fit
CS132-05TG


Courtesy Advanced Interconnections
(Peel-A-Way Terminal Carriers
U.S. Patent No. 4442938)


* Peel-A-Way is a trademark of Advanced Interconnections.

Figure 22. Several Socket Options for Mounting the 80960KB 
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Figure 23a. AMP Micropitch Socket for the 132-Lead Plastic 
Quad Flat Pack, 0.025" Lead Spacing, Gull Wing Leads 
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Figure 23b. 3M Company PQFP Socket and Lid 
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Figure 24. 80960KB PGA Pinout— View from Bottom (Pins Facing Up) 
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Figure 25. 80960KB PGA Pinout — View from Top (Pins Facing Down) 
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Table 5. 80960KB PGA Pinout— In Pin Order 


Pin 

Signal 

Pin 

Signal 

Pin 

Signal 

Pin 

Signal 

A1 

Vcc 


LAD 20 




Vss 

A2 

Vss 

EH 

LAD 13 


^0 



A3 

LADi 9 


LADq 

H3 

LOCK 



A4 

LADi 7 

C9 

LAD 3 

H12 

N.C 



A5 

LADi 6 

CIO 

Vcc 

H13 




A 6 

LADi 4 

C 11 

Vss 

H14 

N.C. 



A7 

LADii 

Cl 2 

nTTa/lNTA 

J1 

DT/R 


N.C. 

A 8 

LADg 

Cl 3 

INT 1 

J2 

BE 2 


N.C. 

A9 

LADy 

C14 

L^/InTq 

J3 

Vss 


N.C. 

A10 

. LAD 5 

D 1 

ALE 

J12 

N.C 


N.C. 

All 

LAD 4 

D2 

ADS 

J13 

N.C. 


N.C. 

A12 

LADi 

D3 

HLDA/HLDR 

J14 

N.C. 


N.C. 

A13 

INT 2 /INTR 

D12 

Vcc 

K1 

^3 


N.C. 

A14 

Vcc 

D13 

N.C. 

K2 

FAILURE 



B1 

LAD 23 

D14 

N.C. 

K3 

Vss 


N.C. 

B2 

LAD 24 

El 

LAD 28 

K12 

Vcc 


N.C. 

B3 

LAD 22 

E2 

LAD 26 

K13 

N.C. 



B4 
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E3 

LAD 27 

. K14 

N.C. 



B5 
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N.C. 

LI 

DEN 



B 6 
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E13 

Vss 

L2 

N.C. 

PI 


B7 

LAD 12 

E14 

N.C. 

L3 

Vcc 

P2 


B 8 

LAD 10 

FI 

LAD 29 


Vss 

P3 


B9 

LADe 

F2 

LAD 31 


N.C. 

P4 
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LAD 2 

F3 

CACHE 



P5 


B11 

CLK2 

FI 2 

N.C. 


N.C. 

P 6 

N.C. 

B12 

LADo 

F13 

N.C. 


Vcc 

P7 

N.C. 

B13 

RESET 

FI 4 

N.C. . 


Vss 

P 8 

N.C. 

B14 

Vss 

G 1 

LAD 30 



P9 

N.C. 

Cl 

HOLD/HLDAR 


READY 


Vcc 

P10 

N.C. 

C2 

LAD 25 

G3 

BEi 

M 6 

N.C. 

P11 

N.C. 

C3 

BADAC 

G12 

N.C. 

M7 

N.C. 

P12 

N.C. 

C4 

Vcc 

G13 

N.C. 

M 8 

N.C. 

P13 

Vss 

C5 

Vss 

G14 

N.C. 

M9 

N.C. 

P14 

Vcc 
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Table 6. 80960KB PGA Pinout — In Signal Order 


Signal 

Pin 

Signal 

Pin 

Signal 

Pin 

Signal 

Pin 

ADS 

D2 

LAD 15 

B 6 

N.C. 

J14 

N.C. 

P9 

PlE 

D1 

LAD 16 

A5 , 

N.C. 

K13 

N.C. 

P10 

BADAC 

C3 

LAD 17 

A4 

N.C. 

K14 

N.C. 

P11 

BE^ 

H2 

LAD 18 

B5 

N.C. 

LI 3 

N.C. 

P12 

BET 

G3 

LADi 9 

A3 

N.C. 

LI 4 

N.C. 

L2 

BE^ 

J2 

LAD 2 O 

C 6 

N.C. 


READY 

G2 

BE^ 

K1 


B4 

N.C. 

mm 

RESET 

B13 

CACHE 

F3 


B3 

N.C. 


Vcc 

A1 

CLK2 

B11 


B1 

N.C. 


Vcc 

A14 

Den 

LI 


B2 

N.C. 


Vcc 

C4 

DT/R 

J1 

LAD 25 

C2 

N.C. 


Vcc 

CIO 

FAILURE 

K2 

LAD 26 

E2 

N.C. 


Vcc 

D12 

HLDA/HOLDR 

D3 

LAD 27 


N.C. 


< 

0 

0 

K12 

HOLD/HLDAR 

Cl 


E1 

N.C. 
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L3 


C14 
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FI 

N.C. 
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INTi 
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N.C. 
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M5 




F2 

N.C. 
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Mil 





N.C. 
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D13 

N.C. 
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mm 

N.C. 

D14 

N.C. 
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mm 

N.C. 

E12 
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N.C. 

mm 
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N.C. 

msm 

N.C. 
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C 11 
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N.C. 


N.C. 


Vss 

E13 
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N.C. 

FI 4 
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H14 
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P 6 
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N1 

LAD 13 
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J 12 
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Figure 26. Measuring 80960KB PGA and PQFP Case Temperature 



Legend: PQFP; PGA with no heatsink; PGA with omnidirectional heatsink; PGA with unidirectional heatsink
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Figure 27. 10 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 28. 16 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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■ PQFP DPGA with no ♦PGA with omni- OPGA with uni- 
heatsink directional heatsink directional heatsink 
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Figure 29. 20 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 30. Maximum Allowable Ambient Temperature for 
the 80960KB at 25 MHz (available In PGA only) 
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Figure 31. Maximum Allowable Ambient Temperature for the Extended 
Temperature TA-80960KB at 20 MHz (available in PGA only) 
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Table 8. 80960KB Plastic Package Pinout — In Pin Order 


Pin  Signal       Pin  Signal   Pin  Signal       Pin  Signal
---  -----------  ---  -------  ---  -----------  ---  -----------
1    HLDA/HOLR    34   N.C.     67   Vss          100  LAD0
2    ALE          35   Vcc      68   Vss          101  LAD1
3    LAD26        36   Vcc      69   N.C.         102  LAD2
4    LAD27        37   N.C.     70   Vcc          103  Vss
5    LAD28        38   N.C.     71   Vcc          104  LAD3
6    LAD29        39   N.C.     72   N.C.         105  LAD4
7    LAD30        40   N.C.     73   Vss          106  LAD5
8    LAD31        41   Vcc      74   Vcc          107  LAD6
9    Vss          42   Vss      75   N.C.         108  LAD7
10   CACHE        43   N.C.     76   N.C.         109  LAD8
11   W/R          44   N.C.     77   N.C.         110  LAD9
12   READY        45   N.C.     78   N.C.         111  LAD10
13   DT/R         46   N.C.     79   Vss          112  LAD11
14   BE0          47   N.C.     80   Vss          113  LAD12
15   BE1          48   N.C.     81   N.C.         114  Vss
16   BE2          49   N.C.     82   Vcc          115  LAD13
17   BE3          50   N.C.     83   Vcc          116  LAD14
18   FAILURE      51   N.C.     84   Vss          117  LAD15
19   Vss          52   Vss      85   IAC/INT0     118  LAD16
20   LOCK         53   Vss      86   INT1         119  LAD17
21   DEN          54   N.C.     87   INT2/INTR    120  LAD18
22   Vss          55   Vcc      88   INT3/INTA    121  LAD19
23   Vss          56   Vcc      89   N.C.         122  LAD20
24   N.C.         57   Vss      90   Vss          123  LAD21
25   N.C.         58   N.C.     91   CLK2         124  LAD22
26   Vss          59   N.C.     92   Vcc          125  Vss
27   Vss          60   N.C.     93   RESET        126  LAD23
28   N.C.         61   N.C.     94   N.C.         127  LAD24
29   Vcc          62   N.C.     95   N.C.         128  LAD25
30   Vcc          63   N.C.     96   N.C.         129  BADAC
31   N.C.         64   N.C.     97   N.C.         130  HOLD/HLDAR
32   Vss          65   N.C.     98   N.C.         131  N.C.
33   Vss          66   N.C.     99   Vss          132  ADS
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Table 9. 80960KB Plastic Package Pinout— In Signal Order 


Signal       Pin  Signal   Pin  Signal   Pin  Signal   Pin
-----------  ---  -------  ---  -------  ---  -------  ---
ADS          132  LAD22    124  N.C.     49   Vcc      41
ALE          2    LAD23    126  N.C.     50   Vcc      55
BADAC        129  LAD24    127  N.C.     51   Vcc      56
BE0          14   LAD25    128  N.C.     54   Vcc      70
BE1          15   LAD26    3    N.C.     58   Vcc      71
BE2          16   LAD27    4    N.C.     59   Vcc      74
BE3          17   LAD28    5    N.C.     60   Vcc      82
CACHE        10   LAD29    6    N.C.     61   Vcc      83
CLK2         91   LAD3     104  N.C.     62   Vcc      92
DEN          21   LAD30    7    N.C.     63   Vss      9
DT/R         13   LAD31    8    N.C.     64   Vss      19
FAILURE      18   LAD4     105  N.C.     65   Vss      22
HLDA/HOLR    1    LAD5     106  N.C.     66   Vss      23
HOLD/HLDAR   130  LAD6     107  N.C.     69   Vss      26
IAC/INT0     85   LAD7     108  N.C.     72   Vss      27
INT1         86   LAD8     109  N.C.     75   Vss      32
INT2/INTR    87   LAD9     110  N.C.     76   Vss      33
INT3/INTA    88   LOCK     20   N.C.     77   Vss      42
LAD0         100  N.C.     24   N.C.     78   Vss      52
LAD1         101  N.C.     25   N.C.     81   Vss      53
LAD10        111  N.C.     28   N.C.     89   Vss      57
LAD11        112  N.C.     31   N.C.     94   Vss      67
LAD12        113  N.C.     34   N.C.     95   Vss      68
LAD13        115  N.C.     37   N.C.     96   Vss      73
LAD14        116  N.C.     38   N.C.     97   Vss      79
LAD15        117  N.C.     39   N.C.     98   Vss      80
LAD16        118  N.C.     40   N.C.     131  Vss      84
LAD17        119  N.C.     43   READY    12   Vss      90
LAD18        120  N.C.     44   RESET    93   Vss      99
LAD19        121  N.C.     45   Vcc      29   Vss      103
LAD2         102  N.C.     46   Vcc      30   Vss      114
LAD20        122  N.C.     47   Vcc      35   Vss      125
LAD21        123  N.C.     48   Vcc      36   W/R      11
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Table 10. 80960KB PGA Package Thermal Characteristics 


Thermal Resistance— °C/Watt 

Parameter 

Airflow— ft./mln (m/sec) 

0 

(0) 

so 

(0.25) 

100 

(0.50) 

200 

(1.01) 

400 

(2.03) 

600 

(3.04) 

800 

(4.06) 

6 Junction-to-Case 
(Case Measured 
as shown in Figure 26) 

2 





2 


0 Case-to-Ambient 
(No Heatsink) 

19 

18 

17 



10 


6 Case-to-Ambient 
(with Omnidirectional 
Heatsink) 

16 

15 

14 

12 

9 

7 

6 

0 Case-to-Ambient 
(with Unidirectional) 
Heatsink) 

15 

14 

13 

11 

8 

6 

5 


"J pin 






NOTES:

1. This table applies to 80960KB PGA plugged into socket or soldered directly into board.

2. θJA = θJC + θCA.

3. θJ-CAP = 4°C/W (approx.)
θJ-PIN = 4°C/W (inner pins) (approx.)
θJ-PIN = 8°C/W (outer pins) (approx.)


Table 11. 80960KB PQFP Package Thermal Characteristics 


PQFP Thermal Resistance, °C/Watt

                              Airflow, ft./min (m/sec)
Parameter                     0     50      100     200     400     600     800
                              (0)   (0.25)  (0.50)  (1.01)  (2.03)  (3.04)  (4.06)
----------------------------  ----  ------  ------  ------  ------  ------  ------
θ Junction-to-Case            9     9       9       9       9       9       9
(Case Measured as shown
in Figure 26)
θ Case-to-Ambient             22    19      18      16      11      9       8
(No Heatsink)


NOTES:

1. This table applies to 80960KB PQFP soldered directly into board.

2. θJA = θJC + θCA.

3. θJL = 18°C/Watt
θJB = 10°C/Watt
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Figure 37. Interrupt Acknowledge Transaction 





Figure 38. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 
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80960CA PRODUCT OVERVIEW 


1.0 PURPOSE 

The 80960CA Product Overview is a summary of the 
features and operation of Intel’s 80960CA Embedded 
Processor. The Product Overview is intended for those 
who are not familiar with the 80960 architecture or the 
80960CA, a product built around this architecture. The 
80960CA Product Overview provides a programmer or 
a system designer with a quick, global view of software 
and hardware design considerations for the 80960CA. 
For further information, refer to the following refer- 
ence documents: 

— The 80960CA User's Manual contains detailed tech- 
nical information and examples for designing em- 
bedded systems using the 80960CA. 

— The 80960CA Data Sheet provides electrical specifi- 
cations for the device, such as the DC and AC pa- 
rameters, operating conditions, and packaging spec- 
ifications. 


2.0 80960CA 32-BIT EMBEDDED 
PROCESSOR 

The 80960CA (Figure 2-1) is optimized for embedded 
processing applications. This product features the high- 
performance C-Series core plus built-in system periph- 
erals, effectively integrating a high-speed CPU and sys- 
tem components onto a single silicon die. The 80960CA 
is a member of Intel’s 80960 embedded processor fami- 
ly. Each member of the 80960 family is based on a 
common architectural definition referred to as the core 
architecture. 

An 80960 family member, such as the 80960CA, is 
made up of an implementation of the core architecture 
plus application-specific extensions. These extensions 
may consist of integrated peripherals, instruction-set 
extensions, or additional registers and caches beyond 
those defined by the architecture. The common core 
architecture provides a basis for code compatibility for 
all 80960 family products, while application-specific ex- 
tensions optimize a particular product for a class of 
applications. 

The 80960 architectural target is the execution of mul- 
tiple instructions per clock (i.e., fractional clocks per 
instruction). By defining an architecture which sup- 
ports parallel instruction execution and out-of-order in- 
struction execution, performance advances are not con- 
strained by the system clock. 

The 80960CA is capable of launching and executing
instructions in parallel. This is accomplished by the use
of advanced silicon technology as well as innovative
"microarchitectural" constructs. The term microarchitecture
refers to the implementation of the instruction
set and programming resources. For example, different
microarchitectures may have different pipeline construction,
internal bus widths, register set porting, degrees
of parallelism, and cache parameterization (two-way,
four-way, etc.).

A principal objective of the 80960 architecture is to 
provide the framework to allow microarchitectural ad- 
vances to translate directly into increased performance 
without architectural limitations. 



2.1 80960 Architecture 

Embedded applications are cost sensitive, require a dif- 
ferent mix of instructions than reprogrammable appli- 
cations, have demanding interrupt response require- 
ments, and often use real-time executives rather than 
full-blown operating systems. The 80960 architecture 
was developed with these factors in mind. Several key 
optimizations which are provided by the architecture 
are explained below. 

Instruction Set: Powerful Boolean operations are pro- 
vided. Frequently executed functions are available as 
single instructions for greater code density and per- 
formance. Call, Return, Compare-and-Branch, Condi- 
tional-Compare, Compare-and-Increment or Decre- 
ment, and Bit-Field-Extract are each single instruc- 
tions. 

Interrupts: A priority interrupt structure simplifies the 
management of real-time events. With 31 discrete levels 
of priority and 248 possible interrupt-handling proce- 
dures, this structure provides the low latency and high 
throughput interrupt handling required in embedded 
processor applications. 
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Faults: A generalized fault-handling mechanism simpli- 
fies the task of detecting errant arithmetic calculations 
or other conditions that typically require a significant 
amount of in-line user code. 

Application-Specific Extensions: The core architecture 
is designed to accept application-specific extensions 
such as instruction set extensions (e.g., string functions, 
floating point), special purpose registers, larger caches, 
on-chip program and data memory, a memory manage- 
ment and protection unit, fault-tolerance support, mul- 
tiprocessing support, and real-time peripherals (DMA, 
serial ports, etc.). 
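
The two figures quoted in the Interrupts paragraph above (31 priority levels and 248 interrupt-handling procedures) are consistent with a simple mapping from vector number to priority. The sketch below assumes, from general 80960 architecture documentation rather than from this overview, that vectors are numbered 8 through 255 and that a vector's priority is its number divided by 8; both the numbering and the function name should be read as assumptions.

```python
FIRST_VECTOR = 8    # assumed lowest usable vector number
LAST_VECTOR = 255   # assumed highest vector number


def priority(vector: int) -> int:
    """Priority level of an interrupt vector, assumed to be vector // 8."""
    if not FIRST_VECTOR <= vector <= LAST_VECTOR:
        raise ValueError(f"vector {vector} is outside {FIRST_VECTOR}..{LAST_VECTOR}")
    return vector // 8


if __name__ == "__main__":
    vectors = range(FIRST_VECTOR, LAST_VECTOR + 1)
    levels = sorted({priority(v) for v in vectors})
    print(f"{len(vectors)} vectors, {len(levels)} priority levels "
          f"({levels[0]}..{levels[-1]})")
```

Under these assumptions each of the 31 levels is shared by eight vectors, which gives the 248 procedures mentioned in the text.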


2.2 80960 C-Series Core 

The C-Series core is an implementation of the 80960
core architecture. The core can execute instructions at
a sustained speed of 66 MIPS(1) with bursts of performance
up to 99 MIPS. To achieve this level of performance,
Intel has incorporated state-of-the-art silicon
technology and innovative microarchitectural constructs
into the C-Series core. Factors which contribute
to the core's performance are listed below.

— Parallel instruction decoding allows the 80960CA 
to start two instructions in every clock, with bursts 
of three instructions per clock. 

— Most instructions execute in a single clock cycle. 

— Multiple independent execution units enable over- 
lapping instruction execution. 

— Advanced silicon technology allows operation with 
a 33 MHz internal clock. 

— Efficient instruction pipeline is designed to mini- 
mize pipeline break losses. 

— Register and resource scoreboarding transparently 
manage parallel execution. 

— Branch look-ahead feature enables branches to exe- 
cute in parallel with other instructions. 

— Local register cache is integrated on-chip. 

— 1 Kbyte two-way set associative instruction cache is 
integrated on-chip. 

— 1 Kbyte Static Data RAM is integrated on-chip. 

These factors combine to make the 80960CA an ultra- 
high performance computing engine. 

NOTE: 

1. Single clock instructions at 33 MHz. 
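
As a rough check, the quoted instruction rates follow from the issue rate and the 33 MHz internal clock. The sketch below is illustrative only: the function name is hypothetical, and it assumes every instruction completes in a single clock, as NOTE 1 states.

```python
# Instruction rates implied by the figures quoted in Section 2.2.
CLOCK_MHZ = 33  # internal clock frequency, from the text


def mips(instructions_per_clock: int, clock_mhz: int = CLOCK_MHZ) -> int:
    """Millions of single-clock instructions started per second."""
    return instructions_per_clock * clock_mhz


sustained = mips(2)  # two instructions started in every clock
burst = mips(3)      # bursts of three instructions per clock
print(sustained, burst)
```

The sustained figure corresponds to two instructions per clock and the burst figure to three.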


2.3 80960CA System Peripherals 

The 80960CA features several extensions to the core 
architecture in the form of integrated peripherals. 
These peripherals are intended to reduce the external 
system requirements needed for embedded applications. 
These peripherals are described below. 


Bus Controller Unit: A 32-bit high-performance bus
controller interfaces the 80960CA to external memory
and peripherals. The bus controller transfers instructions
or data at a maximum rate of 132 Mbytes per
second.(2) Internally programmable wait states and 16
separately configurable memory regions allow the bus
controller to interface with a variety of memory subsystems
with minimum system complexity and maximum
performance.

DMA Controller: A four channel DMA controller per- 
forms high speed data transfers between peripherals 
and memory. The DMA controller provides advanced 
features such as data chaining, byte assembly and disas- 
sembly, and a fly-by mode capable of transfer speeds of 
up to 66 Mbytes per second. This combination of
performance and flexibility is possible only because
the DMA controller is integrated with the
80960CA core.

Interrupt Controller: A priority interrupt controller 
manages 8 external interrupt inputs, 4 internal inter- 
rupt sources from the DMA controller, and a single 
non-maskable interrupt input (NMI). A total of 248 
external interrupt sources are supported by the inter- 
rupt controller by configuring the 8 external interrupt 
pins as an 8-bit input port. The interrupt controller pro- 
vides the mechanism for the low latency and high 
throughput interrupt service featured by the 80960CA. 
The interrupt latency for the 80960CA is typically less
than 1 µs.

3.0 EXECUTION ENVIRONMENT 

The Execution Environment (Figure 3-1) refers to the 
resources which are available for executing code on the 
80960CA. The following sections describe the elements 
of the execution environment. 


3.1 Registers and Literals 

The 80960CA provides four types of working data reg- 
isters: Global Registers, Local Registers, Special Func- 
tion Registers (SFRs), and Control Registers. 

Global and local registers are general purpose 32-bit 
data registers. The SFRs and the control registers pro- 
vide a programmer’s interface to the on-chip peripher- 
als (i.e., the DMA controller, interrupt controller, and 
bus controller). 

NOTE: 

2. 33 MHz internal clock, load or instruction fetch on 
0 wait state, pipelined burst bus.

Figure 3-1. Execution Environment


The 80960 architecture is a register-oriented architec- 
ture. That is, operands and results of instructions are 
placed in working data registers rather than in memory. 
Since the architecture is register oriented, an ample 
supply of registers is provided. The architecture’s work- 
ing register set consists of 16, 32-bit global registers and 
16, 32-bit local registers. 

3.1.1 GLOBAL AND LOCAL REGISTERS 

The procedure call and return mechanism, which is 
part of the 80960 architecture, inspires the names given 
to the local and global registers. When a procedure call 
or return is executed, the contents of global registers 
are preserved across procedure boundaries. In other 
words, the same set of global registers is used for each 
procedure. A new set of local registers, however, is allo- 
cated for each procedure. The 80960’s call and return 
mechanism is explained in Section 3.8. 

The 80960CA supplies 16, 32-bit global registers designated
g0 through g15. Registers g0 through g14 are
general purpose global registers. Register g15 is reserved
for the current Frame Pointer. This register is
available in assembly language as the fp register. The fp 
contains the address of the first byte in the current 
stack frame. The fp register and the stack frame are 
described in Section 3.8. 


The 80960CA supplies 16, 32-bit Local Registers designated
r0 through r15. Registers r3 through r15 are general
purpose local registers. Registers r0, r1, and r2 are
reserved for special functions as follows: r0 contains the
Previous Frame Pointer, r1 contains the Stack Pointer,
and r2 is reserved for the Return Instruction Pointer.
These registers are available in assembly language as,
respectively, the pfp, sp, and rip registers. The pfp, sp,
and rip registers manage stack frame linkage for the
80960’s procedure call and return mechanism. The
function of these registers is described in Section 3.8.

3.1.2 SPECIAL FUNCTION REGISTERS AND 
CONTROL REGISTERS 

The 80960CA uses 3 Special Function Registers (SFRs)
for communicating with on-chip peripherals. These
SFRs are an architectural extension specific to the
80960CA. The SFRs on the 80960CA are designated as
sf0, sf1, and sf2. SFRs are accessed as source operands
by most of the 80960CA’s instructions. The registers
serve as part of the programmer’s interface to the
DMA and interrupt controller.

Control registers, like SFRs, are used to communicate
with the on-chip peripherals. Configuration information
for the peripherals is generally stored in these registers.
Control registers can only be accessed by using
the system control (sysctl) instruction. The sysctl
instruction is used to load the internal control register
from a table in external memory called the control table.
In order to simplify the process of peripheral configuration,
the control registers are automatically loaded
from this table at initialization.

3.1.3 LITERALS

The 80960CA provides literals which may be used in
the place of source register operands in most instructions.
The literals range from 0 to 31 (5 bits). When a
literal is used as an operand, the processor expands it to
32 bits by adding leading zeros. If the instruction defines
an operand larger than 32 bits, the processor zero
extends the literal to the operand size.


3.2 Address Space and Memory

The address space of the 80960CA (Figure 3-2) is considered
a subset of the execution environment since the
code, data, data structures, and external peripherals for
the processor reside here. The 80960 family has an address
space which is 2^32 bytes (4 Gbytes) in size. This
address space is linear (unsegmented); therefore, code,
data, and peripherals may be placed anywhere in the
usable space. For the 80960CA, some memory locations
are reserved or are assigned special functions as
shown in Figure 3-2.


3.2.1 INTERNAL DATA RAM 

The 80960CA provides 1 Kbyte of internal static RAM
for fast access of frequently used data. The data RAM
allows time critical data storage and retrieval, with no
dependence on the performance of the external bus.
Any load or store, including quad-word operations,
executes in a single clock cycle when directed to
internal data RAM. The data RAM is located at
address 00H in the processor’s address space. When the
DMA controller is in use, 32 bytes of data RAM are
reserved for each active DMA channel. Additionally,
64 bytes of data RAM are reserved for 16 interrupt
vectors which may be cached internally to reduce interrupt
latency. The data RAM reserved for the DMA
controller and the interrupt controller can be used for
additional data storage when these peripherals are not
used.

Address (hex)             Decimal  Contents
------------------------  -------  ------------------------------------
0000 0000H - 0000 003FH   0        Interrupt Vectors (optional)
                                   (Internal SRAM)
0000 0040H - 0000 00BFH   64       DMA Registers (optional)
                                   (Internal SRAM)
0000 00C0H - 0000 00FFH   192      Data RAM (Internal SRAM,
                                   User Write Protected)
0000 0100H - 0000 03FFH   256      Data RAM (Internal SRAM,
                                   Programmable User Write Protection)
0000 0400H - FEFF FFFFH   1024     Code/Data, Architecturally Defined
                                   Data Structures (External Memory)
FF00 0000H - FFFF FEFFH            Reserved
FFFF FF00H - FFFF FF2CH            Initialization Boot Record
                                   (External Memory)
FFFF FF2DH - FFFF FFFFH            Reserved (top of address space,
                                   2^32 - 1, 4 Gbytes)

Figure 3-2. Address Space

Two execution modes are possible on the 80960CA, 
user mode or supervisor mode. These modes are used to 
implement a protection model in which system data 
structures are isolated from user code. As shown in 
Figure 3-2, the first 256 bytes of data RAM are always 
write protected when a program is executing in user 
mode but may always be written when executing in 
supervisor mode. The remainder of the data RAM can 
be programmed for this protection feature. The user 
and supervisor modes are described further in Section 
3.7. 
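
The memory map of Figure 3-2 and the user-mode write protection rule for the internal data RAM can be sketched as a small lookup. This is a minimal illustration, not Intel software: the names REGIONS, region_of, and user_write_allowed are hypothetical, and the function models only the internal data RAM, since protection of external memory depends on the memory system.

```python
# Address regions as listed in Figure 3-2 (inclusive bounds).
REGIONS = [
    (0x0000_0000, 0x0000_003F, "interrupt vectors (internal SRAM)"),
    (0x0000_0040, 0x0000_00BF, "DMA registers (internal SRAM)"),
    (0x0000_00C0, 0x0000_00FF, "data RAM, user write protected"),
    (0x0000_0100, 0x0000_03FF, "data RAM, programmable protection"),
    (0x0000_0400, 0xFEFF_FFFF, "external code/data"),
    (0xFF00_0000, 0xFFFF_FEFF, "reserved"),
    (0xFFFF_FF00, 0xFFFF_FF2C, "initialization boot record"),
    (0xFFFF_FF2D, 0xFFFF_FFFF, "reserved"),
]


def region_of(addr: int) -> str:
    """Return the name of the region containing a 32-bit address."""
    for low, high, name in REGIONS:
        if low <= addr <= high:
            return name
    raise ValueError("address is outside the 32-bit address space")


def user_write_allowed(addr: int, upper_ram_protected: bool = False) -> bool:
    """User-mode write check for the 1 Kbyte internal data RAM only."""
    if not 0 <= addr < 0x400:
        raise ValueError("only the internal data RAM is modeled here")
    if addr < 0x100:
        return False                    # first 256 bytes: always protected
    return not upper_ram_protected      # remainder: programmable protection


# Data RAM reserved for peripherals: 4 DMA channels and 16 interrupt vectors.
DMA_BYTES = 4 * 32
VECTOR_BYTES = 16 * 4
```

The reserved sizes agree with the figure: 64 bytes of vectors end at 003FH and 128 bytes of DMA storage end at 00BFH.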

3.2.2 RESERVED ADDRESS SPACE 

The upper 16 Mbytes of memory (FF000000H-
FFFFFFFFH) are reserved for specific functions and 
extensions to the 80960 architecture. The 12 words in 
reserved space (FFFFFF00H-FFFFFF2CH) are used 
to start up the processor when it comes out of reset. 
These 12 words are called the initialization boot record. 

3.2.3 ARCHITECTURALLY DEFINED DATA 
STRUCTURES 

To execute a program on the 80960CA, data structures 
specific to the 80960 architecture must reside in the 
processor’s address space. Architecture-defined data 
structures include stacks, initialization structures, and 
various procedure entry tables. These data structures 
may generally be located anywhere in the address 
space. Pointers to each data structure are specified 
when the 80960CA is initialized. The architecture-de- 
fined data structures include: 

— Interrupt Table

— System-Procedure Table

— Fault Table

— User Stack

— Interrupt Stack

— Supervisor Stack

In addition to the data structures defined by the architecture,
the 80960CA requires several implementation-
specific data structures which are used for configuring 
peripherals and initialization. These data structures in- 
clude: 

— Control Table 

— Process Control Block 

— Initialization Boot Record 

Each data structure will be explained in more detail 
later in this product overview. 


3.3 Memory Addressing Modes 

The 80960CA offers a variety of modes for memory 
addressing. The addressing modes available are summa- 
rized in Table 3-1. 

Absolute addressing is used to reference an address as 
an offset from address 0 of the processor’s address 
space. At the machine level, absolute addressing may be 
implemented in one of two ways depending on the size 
of the absolute offset from address 0. Two instruction 
formats, MEMA and MEMB, are used to provide absolute
addressing modes. For the MEMA format, the offset
is an ordinal number ranging from 0 to 2048. For
the MEMB format, the offset is an integer (called a
displacement) ranging from -2^31 to 2^31 - 1. An assembler
will choose the MEMA or MEMB format based on
the size of the offset.

Register-indirect addressing modes use a 32-bit ordinal 
value in a register as the base for the address calcula- 
tion. Offsets and indexes are added to this address base 
depending on the particular addressing mode. The 
register-indirect-with-index addressing mode adds a 
scaled index to the address base. The index is specified 
as a value in a register. The scale value may be selected 
as 1, 2, 4, 8, or 16. 

The index-with-displacement addressing mode uses a 
scaled index plus an integer displacement. No address 
base is used in this address calculation. 

The IP-with-displacement addressing mode is used with 
load and store instructions to make them IP relative. In 
this mode, an integer displacement plus a constant of 8 
is added to the IP of the instruction to calculate the 
next address. 
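
The address calculations described above, and summarized in Table 3-1, can be sketched in a few lines. This is a hedged illustration: the function name and its parameters are hypothetical, and wrapping the result to 32 bits is an assumption rather than something the text states.

```python
MASK32 = 0xFFFF_FFFF
VALID_SCALES = (1, 2, 4, 8, 16)


def effective_address(abase=0, index=0, scale=1, displacement=0, ip=None):
    """Compute an effective address for the addressing modes of Table 3-1.

    Unused terms default to zero, so for example 'Register Indirect with
    Offset' is abase + displacement, and 'Index with Displacement' is
    index * scale + displacement with no address base.
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, 8, or 16")
    if ip is not None:
        # IP with Displacement: IP + Displacement + 8
        return (ip + displacement + 8) & MASK32
    return (abase + index * scale + displacement) & MASK32


# Register Indirect with Index and Displacement
addr = effective_address(abase=0x1000, index=3, scale=4, displacement=-8)
print(hex(addr))
```

A single function with defaulted terms is used here because every mode in the table is a sum of the same three optional components.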


Table 3-1. Memory Addressing Modes

Mode                                          Description
--------------------------------------------  --------------------------------------
Absolute Offset                               Offset
Absolute Displacement                         Displacement
Register Indirect                             Abase
Register Indirect with Offset                 Abase + Offset
Register Indirect with Index                  Abase + (Index*Scale)
Register Indirect with Index and Displacement Abase + (Index*Scale) + Displacement
Index with Displacement                       (Index*Scale) + Displacement
Register Indirect with Displacement           Abase + Displacement
IP with Displacement                          IP + Displacement + 8


3.4 Data Types

The 80960CA operates on the following data types (Figure 3-3): 

— Integer (8, 16, 32, and 64 bits) 

— Ordinal (8, 16, 32, and 64 bits) 

— Bit 

— Bit Field 

— Triple Word (96 bits) 

— Quad Word (128 bits) 



Figure 3-3. Data Types (columns: Class, Data Type, Length, Range)

The following sections describe the data types supported
by the 80960CA.

3.4.1 NUMERIC DATA TYPES 

Integers and ordinals are considered numeric data 
types since the processor performs arithmetic opera- 
tions with this data. The integer data type is a signed 
binary value in standard 2’s complement representa- 
tion. The ordinal data type is an unsigned binary value. 
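
The difference between the two numeric types is only in how the same bit pattern is interpreted. The following sketch shows this for any of the supported widths; the function names are hypothetical and chosen for illustration.

```python
def as_ordinal(bits: int, width: int = 32) -> int:
    """Interpret the low 'width' bits as an unsigned (ordinal) value."""
    return bits & ((1 << width) - 1)


def as_integer(bits: int, width: int = 32) -> int:
    """Interpret the low 'width' bits as a 2's complement (integer) value."""
    value = bits & ((1 << width) - 1)
    sign_bit = 1 << (width - 1)
    return value - (1 << width) if value & sign_bit else value


# The same 8-bit pattern read both ways.
pattern = 0xFF
print(as_ordinal(pattern, 8), as_integer(pattern, 8))
```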

3.4.2 NON-NUMERIC DATA TYPES 

The remaining data types (bit field, triple word, and 
quad word) represent groupings of bits or bytes that the 
processor can operate on as a whole, regardless of the 
nature of the data contained in the group. These data 
types facilitate the moving of blocks of bits or bytes. 


3.5 Instruction Set 

The 80960CA features a comprehensive instruction set 
(Table 3-2). Much of the instruction set is that of a 
RISC architecture. Unlike pure RISC machines, however,
the 80960CA provides an extension to the RISC
instruction set with instructions that perform complex 
functions such as procedure calls and returns, high- 
speed multiplies, and other complex control, arithme- 
tic, and logical operations. The instruction set allows 
functionally complex yet highly compact code to be 
written for embedded control applications where mem- 
ory is a valuable commodity. 

3.5.1 INSTRUCTION GROUPS 

The 80960CA instruction set is most easily described if 
grouped by the functions listed below: 

— Data Movement 

— Address Computation 

— Logical and Arithmetic 

— Bit and Bit Field 

— Comparison 

— Branch 

— Call and Return 

— Fault 

— Debug 

— Processor Management 

The instructions which make up each of these groups 
are described in the following sections. 


3.5.1.1 Data Movement Instructions

The data movement instructions move data from memory
to registers, from registers to memory, and between
registers. The load instructions copy bytes, words, or
multiple words from memory to a selected register or
group of registers. Conversely, the store instructions
copy bytes, words, or groups of words from a selected
register or group of registers to memory. The move
instructions copy data between registers.

Load Instructions

- ld        load word
- ldob      load ordinal byte
- ldos      load ordinal short
- ldib      load integer byte
- ldis      load integer short
- ldl       load long
- ldt       load triple
- ldq       load quad

Store Instructions

- st        store word
- stob      store ordinal byte
- stos      store ordinal short
- stib      store integer byte
- stis      store integer short
- stl       store long
- stt       store triple
- stq       store quad

Move Instructions

- mov       move word
- movl      move long
- movt      move triple
- movq      move quad


3.5.1.2 Address Computation Instructions 

The load address (lda) instruction causes a 32-bit address
to be computed and placed in a destination register.
The address is computed based on the addressing
mode selected. The load and store instructions perform
a function identical to that of the lda instruction when
calculating a source or destination address. The lda
instruction is useful for loading a 32-bit constant into a
register.

3.5.1.3 Logical and Arithmetic Instructions

Logical instructions perform bitwise Boolean operations
on operands in registers. Since this group of instructions
performs only bitwise manipulations of data,
separate logical instructions for integer and ordinal
data types do not exist. In the table below, src1 and
src2 represent processor registers or literals which are
the operands for these instructions.

Table 3-2. Instruction Set Summary

Data Movement  Arithmetic           Logical        Bit and Bit Field
-------------  -------------------  -------------  -----------------
Load           Add                  And            Set Bit
Store          Subtract             Not And        Clear Bit
Move           Multiply             And Not        Not Bit
               Divide               Or             Check Bit
               Remainder            Exclusive Or   Alter Bit
               Modulo               Not Or         Scan for Bit
               Shift                Or Not         Scan for Byte
               Extended Shift       Nor            Span over Bit
               Extended Multiply    Exclusive Nor  Extract
               Extended Divide      Not            Modify
               Add with Carry       Nand
               Subtract with Carry  Rotate

Comparison             Branch                   Call and Return  Fault
---------------------  -----------------------  ---------------  ------------------
Compare                Unconditional Branch     Call             Conditional Fault
Conditional Compare    Conditional Branch       Call Extended    Synchronize Faults
Compare and Increment  Branch and Link          Call System
Compare and Decrement  Compare and Conditional  Return
Condition Test           Branch

Debug                  Processor Management        Address Computation  Atomic
---------------------  --------------------------  -------------------  -------------
Modify Trace Controls  Modify Process Controls     Load Address         Atomic Add
Mark                   Modify Arithmetic Controls                       Atomic Modify
Force Mark             System Control
                       Update DMA
                       Setup DMA
                       Flush Local Registers

Logical Instructions

- and       src1 and src2
- notand    src1 and (not src2)
- andnot    (not src1) and src2
- or        src1 or src2
- notor     src1 or (not src2)
- ornot     (not src1) or src2
- xor       src1 xor src2
- xnor      src1 xnor src2
- nor       not (src1 or src2)
- nand      not (src1 and src2)
- not       not (src1)
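
The two-operand logical operations listed above can be modeled directly on 32-bit values. The sketch below is for illustration only; the table name LOGICAL_OPS and the helper logical() are hypothetical, and results are masked to 32 bits to mimic a 32-bit register.

```python
MASK32 = 0xFFFF_FFFF

# Each entry follows the operation given in the Logical Instructions list.
LOGICAL_OPS = {
    "and":    lambda s1, s2: s1 & s2,
    "notand": lambda s1, s2: s1 & ~s2,
    "andnot": lambda s1, s2: ~s1 & s2,
    "or":     lambda s1, s2: s1 | s2,
    "notor":  lambda s1, s2: s1 | ~s2,
    "ornot":  lambda s1, s2: ~s1 | s2,
    "xor":    lambda s1, s2: s1 ^ s2,
    "xnor":   lambda s1, s2: ~(s1 ^ s2),
    "nor":    lambda s1, s2: ~(s1 | s2),
    "nand":   lambda s1, s2: ~(s1 & s2),
}


def logical(op: str, src1: int, src2: int) -> int:
    """Apply a two-operand logical operation and keep the low 32 bits."""
    return LOGICAL_OPS[op](src1, src2) & MASK32


# notand and andnot differ only in which operand is complemented.
print(bin(logical("notand", 0b1100, 0b1010)))
print(bin(logical("andnot", 0b1100, 0b1010)))
```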


Arithmetic instructions perform add, subtract, multi- 
ply, divide, and shift operations on integer or ordinal 
operands in registers. 

Arithmetic Instructions

- addi      add integer
- addo      add ordinal
- subi      subtract integer
- subo      subtract ordinal
- muli      multiply integer
- mulo      multiply ordinal
- divi      divide integer
- divo      divide ordinal
- remi      remainder integer
- remo      remainder ordinal
- modi      modulo integer
- rotate    rotate bit left
- shli      shift left integer
- shlo      shift left ordinal
- shri      shift right integer
- shro      shift right ordinal
- shrdi     shift right dividing integer


Extended arithmetic instructions facilitate computation 
on ordinals and integers which are longer than 32 bits. 
In add with carry and subtract with carry instructions, 
the carry out from the previous arithmetic instruction 
is used in the computation. The extended multiply in- 
struction multiplies two ordinal source operands pro- 
ducing a long ordinal result (64 bits). The extended 
divide instruction divides a long ordinal dividend by an 
ordinal divisor and produces a 64-bit result. The ex- 
tended shift right instruction shifts a 64-bit source val- 
ue and produces the lower order 32 bits of the shifted 
value. 
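
The role of the carry in multi-word arithmetic, and the operand widths of the extended multiply and divide, can be sketched as follows. This is an illustrative model, not the processor's definition: the Python function names only echo the mnemonics, and treating the 64-bit result of the extended divide as a 32-bit quotient plus a 32-bit remainder is an assumption.

```python
MASK32 = 0xFFFF_FFFF


def addc(a: int, b: int, carry_in: int):
    """Add two 32-bit ordinals plus a carry; return (sum, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32


def add64(a: int, b: int) -> int:
    """64-bit add built from two 32-bit add-with-carry steps."""
    low, carry = addc(a & MASK32, b & MASK32, 0)
    high, _ = addc(a >> 32, b >> 32, carry)
    return (high << 32) | low


def emul(a: int, b: int) -> int:
    """Multiply two 32-bit ordinals, giving a 64-bit ordinal."""
    return (a & MASK32) * (b & MASK32)


def ediv(dividend64: int, divisor32: int):
    """Divide a 64-bit ordinal by a 32-bit ordinal.

    Returns (quotient, remainder); the packing of the two halves into
    registers is not modeled here.
    """
    quotient, remainder = divmod(dividend64, divisor32 & MASK32)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder


print(hex(add64(0xFFFF_FFFF, 1)))
```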


Extended Arithmetic Instructions

- addc      add ordinal with carry
- subc      subtract ordinal with carry
- emul      extended multiply
- ediv      extended divide
- eshro     shift right extended ordinal

The atomic instructions perform read-modify-write operations
on operands in memory. They allow a system
to ensure that when an atomic operation is performed
on a specified memory location, the operation will be
completed before another agent is allowed to perform
an operation on the same memory. These instructions
are required to enable synchronization between interrupt
handlers and background tasks in any system.
They are also particularly useful in systems where several
agents (processors, coprocessors, or external logic)
have access to the same system memory for communication.

Atomic Instructions

- atadd     atomic add
- atmod     atomic modify

3.5.1.4 Bit and Bit Field Instructions

The bit instructions operate on a specified bit in a register.

Bit Instructions

- setbit    set bit
- clrbit    clear bit
- notbit    not bit
- alterbit  alter bit
- scanbit   scan for bit
- spanbit   span over bit

Bit field instructions operate on a specified contiguous
group of bits in a register. This group of bits can be
from 0 to 32 bits in length.

Bit Field Instructions

- extract   extract field
- modify    modify field
- scanbyte  scan for byte

3.5.1.5 Branch Instructions

The branch instructions allow the direction of program
flow to be changed by explicitly modifying the
Instruction Pointer (IP). The target IP in a branch instruction
is generally specified as a displacement to be
added to the current IP. The extended branch instructions
allow IP calculation using any addressing mode.

The unconditional branch instructions always alter program
flow when executed.


Unconditional Branch Instructions

- b         branch
- bx        branch extended

The RISC branch-and-link instructions automatically
save a Return Instruction Pointer (RIP) before the
jump is taken. The RIP is the address of the instruction
following the branch and link.

Branch and Link instructions 

- bal branch and link 

- balx branch and link extended 

Conditional branch instructions alter program flow 
only if the condition code flags in the arithmetic control 
register match a value specified in the instruction. The 
condition code flags indicate conditions of equality or 
inequality between two operands in a previously execut- 
ed instruction. The arithmetic control register and con- 
dition code flags are described in Section 3.6. 

Based on a branch prediction flag located in the ma- 
chine level instruction, the 80960CA will assume that 
an instruction usually takes or does not take a condi- 
tional branch. By executing along the predicted path of 
program flow, delays due to breaks in the instruction 
stream are often avoided. This feature of the 80960CA 
is referred to as branch prediction. The 80960CA incor- 
porates the branch prediction feature because code us- 
ing a conditional branch instruction usually favors a 
single direction of program flow. 

The branch prediction flag is specified at the assembly
level by appending a .t or .f to a conditional branch
instruction meaning, respectively, “assume branch taken”
or “assume branch not taken”. For example, the
assembler mnemonic be.t means that the processor will
assume that this branch-if-equal instruction usually
branches when encountered. In the following table, .p
represents the branch prediction flag.

Conditional Branch Instructions

- be.p      branch if equal
- bne.p     branch if not equal
- bl.p      branch if less
- ble.p     branch if less or equal
- bg.p      branch if greater
- bge.p     branch if greater or equal
- bo.p      branch if ordered
- bno.p     branch if unordered


Compare and conditional branch instructions compare 
two operands, then branch according to the immediate 
results. 


Compare and Conditional Branch Instructions

- cmpibe.p   compare integer and branch if equal
- cmpibne.p  compare integer and branch if not equal
- cmpibl.p   compare integer and branch if less
- cmpible.p  compare integer and branch if less or equal
- cmpibg.p   compare integer and branch if greater
- cmpibge.p  compare integer and branch if greater or equal
- cmpibo.p   compare integer and branch if ordered
- cmpibno.p  compare integer and branch if unordered
- cmpobe.p   compare ordinal and branch if equal
- cmpobne.p  compare ordinal and branch if not equal
- cmpobl.p   compare ordinal and branch if less
- cmpoble.p  compare ordinal and branch if less or equal
- cmpobg.p   compare ordinal and branch if greater
- cmpobge.p  compare ordinal and branch if greater or equal
- bbs.p      check bit and branch if set
- bbc.p      check bit and branch if clear




3.5.1.6 Compare and Condition Test
Instructions

The 80960CA provides several types of instructions 
that are used to compare two operands. The condition 
code flags in the arithmetic control register are set to 
indicate whether one operand is less than, equal to, or 
greater than the other operand. 

Compare Instructions 

- cmpi compare integer 

- cmpo compare ordinal 

- chkbit check bit 

Conditional compare instructions test the existing
status of the condition code flags before a compare is
performed. These conditional compare instructions are
provided to optimize two-sided range comparisons (i.e.,
to test if a value is less than one number but greater
than another).

Conditional Compare Instructions 

- concmpi conditional compare integer 

- concmpo conditional compare ordinal 
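
A two-sided range comparison of the kind described above might be modeled as follows. This sketch rests on assumptions that the text does not spell out: the condition code values are those of Table 3-3, the compare is taken to report "less" when its first operand is smaller than its second, and the conditional compare is assumed to be skipped whenever the "less" bit is already set. The Python function names only echo the mnemonics and range_check is hypothetical.

```python
GREATER, EQUAL, LESS = 0b001, 0b010, 0b100  # condition codes, Table 3-3


def cmpi(src1: int, src2: int) -> int:
    """Unconditional compare; returns the new condition code."""
    if src1 < src2:
        return LESS
    if src1 == src2:
        return EQUAL
    return GREATER


def concmpi(cc: int, src1: int, src2: int) -> int:
    """Conditional compare (assumed behavior): no effect if 'less' is set."""
    if cc & LESS:
        return cc
    return EQUAL if src1 <= src2 else GREATER


def range_check(value: int, low: int, high: int) -> int:
    """Classify value against [low, high] using two compares."""
    cc = cmpi(high, value)          # LESS when value is above the range
    return concmpi(cc, low, value)  # EQUAL when low <= value <= high


for v in (0, 5, 11):
    print(v, format(range_check(v, 1, 10), "03b"))
```

Under these assumptions a single condition code distinguishes the three outcomes: in range, above the upper bound, or below the lower bound.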

The compare and increment and compare and decrement
instructions set the condition code flags based on
a comparison of two register sources, decrement or
increment one of the sources, and finally store this
result in a destination register.

- cmpinci   compare and increment integer
- cmpinco   compare and increment ordinal
- cmpdeci   compare and decrement integer
- cmpdeco   compare and decrement ordinal


The condition test instructions allow the state of the 
condition code flags to be tested. Based on the outcome 
of the comparison, a true or false code is stored in a 
destination register. The branch prediction flag is used 
in this instruction to reduce the execution time of the 
instruction when the test outcome is predicted correct- 
ly. For example teste.t (test if equal) will execute in a 
shorter time if the condition code flags test true for the 
equal condition. Analogous to the function of the 
branch prediction flag in the conditional compare and 
branch instructions, the prediction flag in this case 
eliminates breaks in the micro-instruction sequence 
which is used to implement the condition test instruc- 
tions. 


Condition Test Instructions

- teste.p   test if equal
- testne.p  test if not equal
- testl.p   test if less
- testle.p  test if less or equal
- testg.p   test if greater
- testge.p  test if greater or equal
- testo.p   test if ordered
- testno.p  test if not ordered


3.5.1.7 Call and Return Instructions 

The 80960CA features an on-chip call and return 
mechanism for making procedure calls to local and sys- 
tem procedures. The call instructions and the call and 
return mechanism are described in Section 3.8.

Call and Return Instructions 

- call call 

- callx call extended 

- calls call system 

- ret return 


3.5.1.8 Fault Instructions

The 80960CA will fault automatically as the result of
certain errant operations which may occur when executing
code. Fault procedures are then invoked automatically
to handle the various types of faults. In addition,
the fault instructions permit a fault to be generated
explicitly based on the value of the condition code
flags. The branch prediction flag in these instructions is
used to reduce the execution time of these instructions
when the state of the condition code flags is predicted
correctly.

Conditional Fault Instructions

- faulte.p   fault if equal
- faultne.p  fault if not equal
- faultl.p   fault if less
- faultle.p  fault if less or equal
- faultg.p   fault if greater
- faultge.p  fault if greater or equal
- faulto.p   fault if ordered
- faultno.p  fault if unordered


The syncf instruction causes the processor to wait for 
all faults to be generated which are associated with any 
prior uncompleted instructions. 

- syncf synchronize faults 

3.5.1.9 Debug Instructions

The processor supports debugging and monitoring of 
program activity through the use of trace events. The 
debug instructions support debugging and monitoring 
software. 


Debug Instructions 

- modtc modify trace controls 

- mark mark 

- fmark force mark 


3.5.1.10 Processor Management Instructions 

The 80960CA provides several instructions for direct 
control of processor functions and for configuring the 
80960CA’s peripherals. A brief description of the proc- 
essor management instructions is given below. 


Processor Management Instructions

- modpc     modify process controls
- modac     modify arithmetic controls
- sysctl    system control instruction
- udma      update DMA SRAM
- sdma      setup DMA
- flushreg  flush local registers


3.6 Arithmetic Controls

The Arithmetic Control (AC) Register is a 32-bit on-chip 
register (Figure 3-4). The AC register is used primarily 
to monitor and control the execution of 80960CA arith- 
metic instructions. The processor reads and modifies 
bits in the AC register when performing many arithme- 
tic operations. The AC register is also used to control 
the faulting conditions for some instructions. The 
modac instruction allows the user to directly read or 
modify the AC register. 

The processor sets the condition code flags (bits 0-2) to 
indicate equality or inequality as the result of certain 
instructions (such as the compare instructions). Other 
instructions, such as the conditional branch instruc- 
tions, take action based on the value of the condition 
code flags. Table 3-3 shows the functional assignment 
for each condition code flag. 


Table 3-3. Arithmetic Condition Codes 


Condition 

Code 

Condition 

001 

Greater Than 

010 

Equal 

100 

Less Than 
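
The way a compare sets the condition code and a later conditional branch tests it can be sketched as a mask test. The condition code values come from Table 3-3; the per-branch masks and the operand order of the compare are assumptions made for this illustration, and the Python names are hypothetical.

```python
GREATER, EQUAL, LESS = 0b001, 0b010, 0b100  # Table 3-3


def compare(src1: int, src2: int) -> int:
    """Return the condition code for src1 relative to src2 (assumed order)."""
    if src1 < src2:
        return LESS
    if src1 == src2:
        return EQUAL
    return GREATER


# Assumed masks: a branch is taken when its mask shares a set bit with
# the current condition code.
BRANCH_MASK = {
    "be":  EQUAL,
    "bne": LESS | GREATER,
    "bl":  LESS,
    "ble": LESS | EQUAL,
    "bg":  GREATER,
    "bge": GREATER | EQUAL,
}


def branch_taken(mnemonic: str, cc: int) -> bool:
    """Decide whether a conditional branch would be taken for a given code."""
    return bool(BRANCH_MASK[mnemonic] & cc)


cc = compare(3, 7)
print(format(cc, "03b"), branch_taken("bl", cc), branch_taken("bge", cc))
```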


The integer overflow flag (bit 8) and the integer over- 
flow mask (bit 12) are used in conjunction with the 
arithmetic integer overflow fault. The mask bit masks 
the integer overflow fault. When the fault is masked, 
and an integer overflow occurs, the integer overflow 
flag is set but no fault handling action is taken. If the 
fault is not masked, and an integer overflow occurs, the 
integer overflow fault is taken and the integer overflow 
flag is not set. 
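
The interaction of the integer overflow mask and flag can be modeled as shown below. This is a sketch under stated assumptions: the exception class and function are hypothetical, and returning the wrapped 32-bit result when the fault is masked is an assumption, since the text does not say what value the destination receives.

```python
class IntegerOverflowFault(Exception):
    """Stands in for the arithmetic integer overflow fault."""


def add_integer(a: int, b: int, overflow_mask: bool, overflow_flag: bool = False):
    """Add two 32-bit integers; return (result, overflow_flag).

    Masked overflow sets the flag and takes no fault. Unmasked overflow
    raises the fault and leaves the flag unchanged.
    """
    exact = a + b
    wrapped = ((exact + 2**31) % 2**32) - 2**31  # wrap into the 32-bit range
    if wrapped != exact:
        if overflow_mask:
            return wrapped, True
        raise IntegerOverflowFault("integer overflow")
    return wrapped, overflow_flag


print(add_integer(2**31 - 1, 1, overflow_mask=True))
```

Because the flag is passed in and returned, a sequence of masked operations can carry it forward and test it once at the end.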

The no imprecise faults flag (bit 15) determines if im- 
precise faults are allowed to occur. Fault handling and 
precise and imprecise faults in the 80960CA are dis- 
cussed in Section 3.10. 


3.7 Process Management 

Process management refers to the monitoring and con- 
trol of certain properties of an executing process. The 
following sections describe the mechanisms available on 
the 80960CA to perform this function.

Figure 3-4. Arithmetic Control Register

3.7.1 PROCESS CONTROL REGISTER

The Process Control (PC) Register (Figure 3-5) provides
access to process state information. The function of
each field in the PC register is described below.

Execution Mode Flag — This flag indicates that the
processor is executing in user mode (0) or supervisor
mode (1).

Priority Field — This 5-bit field indicates the current executing
priority of the processor. Priority values range
from 0 to 31, with 0 as the lowest and 31 as the highest
priority.

State Flag — This flag determines the executing state of
the processor. The processor state is either executing
state (0) or interrupted state (1).

Trace Enable Bit and Trace Fault Pending Flag —
These fields control and monitor trace activity in the
processor. The Trace Enable Bit enables fault generation
for trace events. The Trace Fault Pending Flag
indicates that a trace event has been detected.


The process controls can be modified by software with
the modify process controls (modpc) instruction. The
modpc instruction may only write the PC register when
the processor is in supervisor mode.

3.7.2 PRIORITIES

The 80960 architecture defines a means to assign priorities
to executing programs and interrupts. The current
priority of the processor is stored in the priority field of
the PC register. This priority is used to determine if an
interrupt will be serviced and in which order multiple
pending interrupts will be serviced. Setting the priority
of an executing program above that of interrupts allows
critical code to be prioritized and executed without interruption.

The priority field of the PC register can be modified
directly using the modpc instruction. The priority field
is also modified to reflect the priority of serviced interrupts.
On a return from an interrupt routine, the priority
of the processor is restored to its priority before the
interrupt occurred.

3.7.3 PROCESSOR STATES AND MODES

The 80960CA may execute programs in user mode or
supervisor mode. The user-supervisor protection mechanism
allows a system to be designed in which kernel
code and data reside in the same address space as user
code and data, but access to the kernel procedures and
data is only allowed through a tightly controlled interface.
This interface is the system call table and the interrupt
mechanism. The 80960CA provides a supervisor
pin (SUP) to implement memory systems which
protect code and data from possible corruption by programs
executing in user mode. Some instructions and
functions of the 80960CA are also insulated from code
executing in user mode.

The processor has two operating states: executing and
interrupted. In executing state, the processor can execute
in user or supervisor mode. In the interrupted
state, the processor always executes in supervisor mode.


3.8 Call and Return Mechanism 

The 80960 architecture features a built-in call and re- 
turn mechanism. This mechanism is designed to make 
procedure calls simple and fast, and to provide a flex- 
ible method for storing and handling variables that are 
local to a procedure. A call automatically allocates a 
new set of local registers and a new stack frame. All 
linkage information is maintained by the processor, 
making procedure calls and returns virtually transpar- 
ent to the user. A system call instruction is provided as 
a method for calling privileged procedures such as a 
kernel service. The call and return model supports effi- 
cient translation of structured high level code (such as 
C, or ADA) to 80960 machine language. 

The procedure call and return mechanism provides a 
number of significant benefits which contribute to the 
performance and ease of use of the 80960CA. 

1) The call and return instructions are implemented en- 
tirely on-chip, resulting in an extremely high per- 
formance implementation of these commonly used 
functions. 

Figure 3-5. Process Control Register 


2) A single instruction to implement each call or return 
operation results in code density improvements com- 
pared to processors which require multiple instruc- 
tions to encode these functions. 

3) By implementing the call and return functions as 
single instructions, the 80960 architecture is open for 
further optimization of these instructions, while 
maintaining assembly-level compatibility. 

4) A program does not have to explicitly save or restore 
the variables stored in the local registers when a call 
or return is executed. The processor does this implic- 
itly on procedure calls and on returns. 

5) The call and return mechanism provides a structure 
for storing a virtually unlimited number of local 
variables for each procedure: the on-chip local regis- 
ters provide quick access to often used variables and 
the stack provides space for additional variables. 

3.8.1 LOCAL REGISTERS AND THE STACK FRAME 

At any point in a program, the 80960 has access to a 
local register set and a section of the procedure stack 
referred to as a stack frame. When a call is executed, a 
new stack frame is allocated for the called procedure. 
Additionally, the current local register set is saved by 
the processor, freeing these registers for use by the newly 
called procedure. In this way, every procedure has a 
unique stack frame and a unique set of local registers. 
When a return is executed, the current local register set 
and current stack frame are deallocated. The previous local 
register set and previous stack frame are restored. This 
call and return mechanism is illustrated in Figure 3-6, 
where n is the procedure depth for the currently executing 
procedure. 

The procedure stack structure is defined by the 80960 
architecture. The procedure stack always grows up- 
ward (i.e. towards higher addresses) and the stack 
pointer (SP) always points to the next available byte of 
the stack frame. The 80960CA requires that each stack 
frame begin on a 16-byte boundary. Due to this alignment 
requirement, a padding space of 0 to 15 bytes may 
exist between adjacent stack frames in memory. When 
a stack frame is allocated, the first 16 words are always 
assigned as storage for the local registers; therefore, the 
SP initially points to the 17th word in the stack frame. 
It should be noted that although each stack frame is 
assigned storage space for the local registers, these loca- 
tions in the stack are not guaranteed to contain the 
values of the saved local registers. This is because sever- 
al sets of local registers are cached on-chip rather than 
written to the stack in external memory. This caching 
mechanism is described in detail later in this section. 
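
The frame allocation rules above (upward growth, 16-byte alignment, 16 words reserved for local registers) can be sketched in C. This is a simplified model for illustration only; the structure and function names are hypothetical, and the real processor performs these steps in hardware.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_ALIGN     16u  /* each frame begins on a 16-byte boundary */
#define LOCAL_REG_WORDS 16u  /* first 16 words are reserved for local registers */
#define WORD_BYTES      4u

/* Hypothetical model of the linkage pointers of one frame. */
typedef struct {
    uint32_t fp;   /* address of the first byte of the frame */
    uint32_t sp;   /* next available byte of the frame */
    uint32_t pfp;  /* frame pointer of the previous frame */
} frame_t;

/* Model what a call does to the linkage pointers. */
frame_t frame_on_call(frame_t caller) {
    frame_t callee;
    assert(caller.sp >= caller.fp);
    /* Round the caller's SP up to the next 16-byte boundary. */
    callee.fp  = (caller.sp + (FRAME_ALIGN - 1u)) & ~(FRAME_ALIGN - 1u);
    callee.pfp = caller.fp;
    /* SP initially points to the 17th word of the new frame. */
    callee.sp  = callee.fp + LOCAL_REG_WORDS * WORD_BYTES;
    return callee;
}

/* Padding (0 to 15 bytes) left between the two frames. */
uint32_t frame_padding(frame_t caller) {
    return frame_on_call(caller).fp - caller.sp;
}
```

For example, a caller whose SP is 1049H gets a new frame at 1050H, with 7 bytes of padding and an initial SP of 1090H.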

3.8.2 PROCEDURE LINKING 

The 80960 architecture automatically manages procedure 
linkage. One global register and three local registers 
are reserved for procedure linkage information. 
Figure 3-7 describes the pointer structure used to link 
frames and to provide a unique SP for each frame. 
Register g15 is the Frame Pointer (FP). The FP is the 
address of the first byte of the current (topmost) stack 
frame. The FP is always updated to point to the current 
frame when calls and returns are executed. Register r0 
is the Previous Frame Pointer (PFP). The PFP is the 
address of the first byte of the stack frame which was 
created prior to the frame containing this PFP. Register 
r1 is the Stack Pointer (SP). The SP points to the next 
available byte of the stack frame. Register r2 is reserved 
for the Return Instruction Pointer (RIP). The RIP is 
the address of the instruction which follows a call 
instruction; this is also the target address for the return 
from that procedure. The RIP is automatically stored 
in register r2 of the calling procedure when a call is 
executed. 

Figure 3-6. Call and Return Mechanism 

3.8.3 PARAMETER PASSING 

Parameters may be passed by value or passed by refer- 
ence between procedures. The global registers, the 
stack, or predefined data structures in memory may be 
used to pass these parameters. 


The global registers provide the fastest method for pass- 
ing parameters. The values to be passed into a proce- 
dure reside in the global registers of the calling proce- 
dure. When a procedure is called, the values in the 
global registers are preserved. If more parameters are to 
be passed than will fit in the global registers, additional 
parameters may be passed in the stack of the calling 
procedure, or in a data structure which is referenced by 
a pointer passed in the global registers. 

3.8.4 LOCAL REGISTER CACHE 

The 80960CA provides an on-chip cache for saving and 
restoring the local registers on calls and returns. This 
cache greatly enhances performance of the call and re- 
turn mechanism on the 80960CA. Movement of data 
between the local registers and the register cache is typ- 
ically accomplished in only 4 processor clocks with no 
external bus traffic. When this cache is filled, the regis- 
ters associated with the oldest stack frame are moved to 
the area reserved for those registers on the physical 
stack (Figure 3-7). 

Figure 3-7. Stack Frame Linkage 

The local register cache is a physical extension of the 
internal data RAM. The part of the data RAM used for 
this cache is not visible to the user and is large enough 
to hold up to 5 sets of local registers. The register cache 
may be extended to hold up to 15 sets of local registers. 
When extended, each new register set consumes 16 
words of the user’s data RAM, beginning at the highest 
address and growing downward. The size of the local 
register cache is selected when the processor is initial- 
ized. 
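
The cost of extending the register cache follows directly from the figures above. A minimal sketch of the arithmetic, with a hypothetical function name:

```c
#include <assert.h>

#define BASE_REGISTER_SETS  5u   /* sets held without using user data RAM */
#define MAX_REGISTER_SETS   15u  /* largest configurable cache */
#define WORDS_PER_SET       16u  /* each extra set consumes 16 words */

/* Words of user data RAM consumed by a cache of `sets` register sets. */
unsigned regcache_user_ram_words(unsigned sets) {
    assert(sets >= BASE_REGISTER_SETS && sets <= MAX_REGISTER_SETS);
    return (sets - BASE_REGISTER_SETS) * WORDS_PER_SET;
}
```

A fully extended cache of 15 sets therefore takes 160 words (640 bytes) from the top of the user's data RAM.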

In some cases, the contents of the cached local register 
sets may require examination or modification (e.g. for 
fault handling). Since the local registers are cached, the 
flushreg instruction is provided to flush the local regis- 
ter cache to the locations reserved for the registers on 
the stack. This ensures that the values in external memory 
are consistent with the values held in the local register 
cache. 


3.8.5 LOCAL AND SYSTEM CALLS 

The 80960CA provides two methods for making proce- 
dure calls: local calls and system calls. Local and sys- 
tem calls differ in their operation and use in an applica- 
tion. 


The local call instructions initiate a procedure call us- 
ing the call and return mechanism described earlier. 
The stack frames for these procedure calls are allocated 
on the local procedure stack. A local call is made using 
either of two local call instructions: call or callx. The 
call instruction specifies the address of the called 
procedure using an IP plus displacement addressing mode 
with a range of -2^23 to 2^23 - 4 bytes from the current 
IP. The callx (call extended) instruction specifies the 
address of the called procedure using any of the 
80960’s addressing modes. 


A system call is made using the calls instruction. This 
call is similar to a local call except that the processor 
gets the IP for the called procedure from a data struc- 
ture called the system procedure table. The calls in- 
struction requires a procedure number operand. This 
procedure number serves as an index into the system 
procedure table, which contains IP’s for specific proce- 
dures. The system procedure table is shown in Figure 
3-8. 


The system call mechanism supports two types of procedure 
calls: system-local calls and system-supervisor 
calls (also referred to as supervisor calls). The 
system-local call performs the same action as the local call 
instructions with one exception: the IP target for a 
system-local call is fetched from the system-procedure 
table. The supervisor call differs from the local call as 
follows: 

1) A supervisor call causes the processor to switch to 
another stack (called the supervisor stack). 

2) A supervisor call causes the processor to switch to 
the supervisor execution mode and asserts the 
80960CA’s supervisor (SUP) pin for all bus accesses. 

The system call mechanism offers several benefits. The 
system call promotes the portability of application soft- 
ware. System calls are commonly used for kernel serv- 
ices. By calling these services with a procedure number 
rather than a specific IP, application software does not 
have to be changed each time the implementation of the 
kernel service is modified. Additionally, the ability to 
switch to a different execution mode and stack allows 
kernel procedures and data to be insulated from appli- 
cation code. 
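
The portability benefit described above can be illustrated with a software analogue in C: callers name a service by number, and a table maps the number to the current implementation. This is only an analogy for the calls instruction and the system procedure table; the services and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sys_proc_t)(int);

/* Two hypothetical kernel services. */
int svc_double(int x) { return 2 * x; }
int svc_negate(int x) { return -x; }

/* Analogue of the system procedure table: entry n holds the
   entry point for procedure number n. */
static const sys_proc_t system_procedure_table[] = {
    svc_double,  /* procedure number 0 */
    svc_negate   /* procedure number 1 */
};

#define NUM_SYS_PROCS \
    (sizeof system_procedure_table / sizeof system_procedure_table[0])

/* Analogue of "calls n": the caller supplies only a procedure number. */
int system_call(size_t proc_number, int arg) {
    assert(proc_number < NUM_SYS_PROCS);
    return system_procedure_table[proc_number](arg);
}
```

Replacing svc_double with a new implementation changes only the table entry; code that invokes procedure number 0 is unaffected.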

3.8.6 IMPLICIT PROCEDURE CALLS 

The call and return mechanism described for procedure 
calls applies to several classes of call instructions as 
well as to the context switching initiated by interrupts 
and faults. When an interrupt or fault condition occurs, 
an implicit call is performed that saves the current state 
of the processor before branching to the interrupt or 
fault handling procedure. When this context switch oc- 
curs, the local registers are saved and a new stack frame 
is allocated. Additionally, the values of the AC register 
and PC register are saved when the implicit call occurs. 
These values are restored on the return from the inter- 
rupt or fault handler. 


3.9 Interrupts 

An interrupt is a temporary break in the control stream 
of a program so that the processor can handle another 
task. Interrupts may be triggered by the instruction 
stream or by hardware sources internal and external to 
the 80960CA. An interrupt request is associated with a 
vector (i.e. an address) of an interrupt handling proce- 
dure. The processor will branch to the handling proce- 
dure when an interrupt is serviced. When the handling 
action is completed, the processor is restored to its state 
prior to the interrupt. 

3.9.1 INTERRUPT VECTORS AND PRIORITY 

Interrupt vectors are simply instruction pointers (ad- 
dresses) to interrupt handling procedures. The 80960 
architecture defines 248 interrupt vectors. This means 
that 248 unique interrupt handling procedures may be 
used. An 8-bit interrupt vector number is associated 
with each interrupt vector. This number ranges from 8 
to 255. Each interrupt vector has a priority from 1 to 
31, which is determined by the 5 most significant bits of 
the interrupt vector number. Priority 1 is the lowest 
priority and 31 is the highest. Priority 0 interrupts are 
not defined. 
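
Because the priority is the upper 5 bits of the 8-bit vector number, it can be obtained with a 3-bit right shift. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>

/* Vector numbers 8 to 255 are defined; 0 to 7 would map to priority 0. */
int vector_is_defined(unsigned vector) {
    return vector >= 8u && vector <= 255u;
}

/* Priority is the 5 most significant bits of the 8-bit vector number. */
unsigned vector_priority(unsigned vector) {
    assert(vector_is_defined(vector));
    return vector >> 3;
}
```

Eight consecutive vector numbers share each priority; vectors 80 to 87, for instance, all have priority 10.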

The 80960CA executes with a unique priority ranging 
from 0 to 31. When an interrupt is serviced, the proces- 
sor’s priority switches to the priority corresponding to 
that of the interrupt request. When a return from an 
interrupt procedure is executed, the process priority is 
restored to its value prior to servicing the interrupt. 
This priority switching is handled automatically by the 
80960CA. 

The 80960CA compares its current priority and the pri- 
ority of an interrupt request to determine whether to 
service an interrupt immediately or to delay service. If a 
requested interrupt priority is greater than the proces- 
sor’s current priority or equal to 31, the processor serv- 
ices the interrupt immediately; otherwise, the processor 
saves (posts) the interrupt request as a pending inter- 
rupt so that it can be serviced later. When the proces- 
sor’s priority falls below the priority of a pending inter- 
rupt, the pending interrupt is serviced. With the mecha- 
nism described, interrupts with a priority of 0 will nev- 
er be serviced. For this reason, vectors numbered 0 to 7 
are not defined. 
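
The service-or-post decision described above reduces to one comparison plus the special case for priority 31. A sketch in C, with hypothetical type and function names:

```c
#include <assert.h>

typedef enum { SERVICE_NOW, POST_PENDING } interrupt_action_t;

/* Decide how a request is handled given the processor's current priority. */
interrupt_action_t interrupt_action(unsigned current_priority,
                                    unsigned requested_priority) {
    assert(current_priority <= 31u);
    assert(requested_priority >= 1u && requested_priority <= 31u);
    if (requested_priority > current_priority || requested_priority == 31u) {
        return SERVICE_NOW;
    }
    return POST_PENDING;
}
```

Note that a priority-31 request is serviced even when the processor is already running at priority 31, while any other request of equal priority is posted.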

3.9.2 INTERRUPT TABLE 

The interrupt table (Figure 3-9) is an architecturally 
defined data structure which holds the interrupt vectors 
and information on pending interrupts. The first 36 
bytes of the table are used to post interrupts. Each of 
the 31 most significant bits in the 32-bit pending priorities 
field represents a possible priority (1 to 31) of a pending 
interrupt. When the processor posts an interrupt in the 
interrupt table, the bit corresponding to the interrupt’s 
priority is set. For example, if an interrupt with a prior- 
ity of 10 is posted in the interrupt table, bit 10 is set in 
the pending priorities field. 

The pending interrupts field contains a 256-bit string in 
which each bit represents an interrupt vector. When the 
processor posts an interrupt in the interrupt table, the 
bit corresponding to the vector number of that inter- 
rupt is set. 
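
Posting can be modelled in C as setting one bit in each of the two fields. The 36-byte layout (one 32-bit word followed by a 256-bit string) follows the text; storing the bit string as eight 32-bit words with bit 0 of word 0 representing vector 0 is an assumption of this sketch, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* First 36 bytes of the interrupt table. */
typedef struct {
    uint32_t pending_priorities;     /* bit p set: priority p is pending */
    uint32_t pending_interrupts[8];  /* 256 bits, one per vector number */
} interrupt_post_area_t;

/* Post an interrupt: set its priority bit and its vector bit. */
void post_interrupt(interrupt_post_area_t *t, unsigned vector) {
    assert(vector >= 8u && vector <= 255u);
    unsigned priority = vector >> 3;
    t->pending_priorities |= 1u << priority;
    t->pending_interrupts[vector / 32u] |= 1u << (vector % 32u);
}

/* Nonzero when the given vector is currently posted. */
int is_posted(const interrupt_post_area_t *t, unsigned vector) {
    assert(vector <= 255u);
    return (t->pending_interrupts[vector / 32u] >> (vector % 32u)) & 1u;
}
```

Posting vector 82 (priority 10) sets bit 10 of the pending priorities field, which matches the example in the text.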

Portions of the interrupt table are cached on-chip in a 
non-transparent fashion. This caching is implemented 
to minimize interrupt latency by reducing the number 
of accesses to the table in external memory when an 
interrupt is serviced. 

3.9.3 INTERRUPT STACK 

Stack frames for interrupt handling procedures are allo- 
cated on a separate interrupt stack. The interrupt stack 
can be located anywhere in the processor’s address 
space. The beginning address of the interrupt stack is 
specified when the processor is initialized. 


3.9.4 INTERRUPT HANDLING ACTION 

When an interrupt is serviced, the processor saves the 
processor state and calls the interrupt procedure. The 
processor state is restored upon return from the inter- 
rupt procedure. 

This interrupt service mechanism is handled by an implicit 
call operation. When the interrupt is serviced, the 
current local registers are saved. A new local register 
set and stack frame are allocated on the interrupt stack 
for the interrupt handler procedure and the processor 
switches to supervisor execution mode. In addition to 
the local registers, the current value of the AC and PC 
registers are saved as an interrupt record on the inter- 
rupt stack. 

3.9.5 PENDING INTERRUPTS 

Any of the 248 interrupts can be requested by software. 
The system control instruction (sysctl) is provided to 
support this feature. When the system control instruction 
requests an interrupt, one of two actions may occur 
depending on the priority of the requested interrupt 
and the current process priority: 1) the interrupt is 
serviced immediately, or 2) the interrupt is posted (the 
pending priorities field and the pending interrupts field 
are modified to reflect a pending interrupt). 

Interrupts may also be requested by hardware sources 
internal and external to the 80960CA. Managing the 
hardware sources and posting these interrupts is han- 
dled by the interrupt controller. Interrupts requested by 
hardware are posted in an internal register, not in the 
interrupt table. A mask register enables or disables in- 
terrupts from each hardware source. Requesting and 
posting hardware interrupts is described in Section 4.4 
Interrupt Controller. 


3.9.6 INTERRUPT LATENCY 


The time required to perform an interrupt task switch 
is referred to as the interrupt latency. The latency is the 
time measured between the activation of an interrupt 
source and the execution of the first instruction for the 
interrupt-handling procedure for the source. 

Interrupt latency for the 80960CA varies depending on 
conditions such as: 



— Complex instructions are executing when the inter- 
rupt occurs (e.g. sysctl, call, ret, etc.). 

— Outstanding loads to a local register are pending, 
delaying the interrupt context switch. 

— Division, multiplication, or other multi-cycle in- 
structions with a local register as destination are 
executing. 


The 80960CA has been designed to optimize latency 
and throughput for interrupts. Two processor features 
are designed for this purpose: 

First, in the interrupt table, all interrupt vectors with 
an index whose least significant four bits are 0010 (binary) 
can be cached in internal data RAM. The processor will 
automatically read these vectors from data RAM when 
the interrupt is serviced. This feature reduces the added 
latency due to an external access of the interrupt table 
for that vector. The NMI vector is always cached in 
data RAM. 
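
The cacheable vectors can be identified with a simple mask test. The sketch below assumes the garbled bit pattern in the scanned text reads 0010 in binary; the function names are hypothetical.

```c
#include <assert.h>

/* A vector can be cached in data RAM when the low four bits
   of its index are 0010 (binary). */
int vector_is_cacheable(unsigned vector) {
    assert(vector <= 255u);
    return (vector & 0xFu) == 0x2u;
}

/* Count the defined vectors (8 to 255) that satisfy the test. */
unsigned count_cacheable_vectors(void) {
    unsigned count = 0;
    for (unsigned v = 8u; v <= 255u; v++) {
        if (vector_is_cacheable(v)) {
            count++;
        }
    }
    return count;
}
```

Under that reading, one vector per priority level from 2 to 16 and so on qualifies (18, 34, 50, and upward in steps of 16), so latency-critical handlers are best assigned those vector numbers.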


Second, an instruction cache locking mechanism allows 
interrupt procedures or segments of interrupt procedures 
to be stored in the instruction cache. These routines 
are always executed from the internal cache, which 
eliminates external code fetches, reduces latency, and 
increases throughput for the interrupt. 

3.10 Fault Handling and Instruction Tracing 

The 80960CA is able to detect various conditions in 
code or in its internal state that could cause the proces- 
sor to deliver incorrect or inappropriate results or that 
could cause it to head down an undesirable control 
path. These conditions are referred to as faults. The 
80960 architecture provides fault handling mechanisms 
to detect and, in most cases, fully recover from a fault. 

The 80960CA provides on-chip debug support by trig- 
gering trace events and servicing the trace fault. A trace 
event is activated when a particular instruction or type 
of instruction is encountered in an instruction stream. 
The trace event optionally signals a fault. A fault handling 
procedure for the trace fault can act as a debug 
monitor and analyze the state of the processor when the 
trace event occurred. 

3.10.1 FAULT TYPES AND SUBTYPES 

All of the faults that the processor detects are pre- 
defined. These faults are divided into types and sub- 
types, each of which is given a number. Table 3-4 lists 
the faults that the processor detects arranged by type 
and subtype. 


Table 3-4. Fault Types and Subtypes 

Fault Type    Fault Subtype             Fault Record 
Parallel                                XX00 00XX 
Trace         Instruction Trace         XX01 0002 
              Branch Trace              XX01 0004 
              Call Trace                XX01 0008 
              Return Trace              XX01 0010 
              Prereturn Trace           XX01 0020 
              Supervisor Trace          XX01 0040 
              Breakpoint Trace          XX01 0080 
Operation     Invalid Opcode            XX02 0001 
              Unimplemented             XX02 0002 
              Invalid Operand           XX02 0004 
Arithmetic    Integer Overflow          XX03 0001 
              Arithmetic Zero-Divide    XX03 0002 
Constraint    Range                     XX05 0001 
              Privileged                XX05 0002 
Protection    Length                    XX07 0001 
Type          Mismatch                  XX0A 0001 


NOTE: X refers to preserved locations in the fault record. 
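
The encodings in Table 3-4 suggest that the fault type occupies the third byte of the fault record word and the subtype the lowest byte. The following C sketch decodes a word on that reading; the exact field widths are an assumption inferred from the table, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fault type numbers as listed in Table 3-4. */
typedef enum {
    FAULT_PARALLEL   = 0x00,
    FAULT_TRACE      = 0x01,
    FAULT_OPERATION  = 0x02,
    FAULT_ARITHMETIC = 0x03,
    FAULT_CONSTRAINT = 0x05,
    FAULT_PROTECTION = 0x07,
    FAULT_TYPE       = 0x0A
} fault_type_t;

/* Fault type: assumed to be bits 16 to 23 of the fault record word. */
unsigned fault_type(uint32_t record_word) {
    return (record_word >> 16) & 0xFFu;
}

/* Fault subtype: assumed to be bits 0 to 7 of the fault record word. */
unsigned fault_subtype(uint32_t record_word) {
    return record_word & 0xFFu;
}

/* Nonzero for an arithmetic zero-divide fault (XX03 0002). */
int is_zero_divide(uint32_t record_word) {
    assert(fault_type(record_word) <= 0xFFu);
    return fault_type(record_word) == FAULT_ARITHMETIC
        && fault_subtype(record_word) == 0x02u;
}
```

The top byte (the "XX" positions) is masked off before comparison, since the table marks it as preserved.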


Figure 3-10. Fault Table 

Byte Offset   Entry 
0             Parallel Fault Entry 
8             Trace Fault Entry 
16            Operation Fault Entry 
24            Arithmetic Fault Entry 
40            Constraint Fault Entry 
56            Protection Fault Entry 
80            Type Fault Entry 

Entries not listed, through byte offset 252, are reserved 
(initialize to 0). 

Local Procedure Fault Table Entry: the fault-handler 
address, with bits 1 and 0 set to 00. 

System Procedure Table Fault Table Entry: the system 
procedure index, with bits 1 and 0 set to 10, followed 
by a word set to 0000027FH. 

Figure 3-11. Fault Record 

Byte Offset   Field 
0             Process Controls 
4             Arithmetic Controls 
8             Fault Type and Subtype (other bits reserved) 
12            Address of Faulting Instruction 


3.10.2 FAULT TABLE 

The fault table (Figure 3-10) provides the processor 
with a pathway to fault handling procedures. The fault 
table is an architecture-defined data structure, which 
may be located anywhere in the processor’s address 
space. The location of the fault table is specified at ini- 
tialization. When a fault occurs, an entry in the table is 
selected based on the type of fault that occurs. The 
entry in the fault table contains a pointer to a specific 
fault handler. 

The fault table can contain two types of entries (Figure 
3-10). The first type of entry is simply a pointer to the 
address of the fault-handling procedure. The second 
type of entry is an index into the system-procedure ta- 
ble. Fault-handling procedures accessed through the 
system-procedure table may be executed in user or su- 
pervisor execution mode. 
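
The byte offsets that survive in Figure 3-10 (0, 16, 24, 40, 56, 80) are consistent with 8-byte entries indexed by the fault type numbers of Table 3-4. A minimal sketch of that selection rule follows; the 32-entry limit is inferred from the table ending at offset 252, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FAULT_ENTRY_BYTES   8u   /* two words per fault table entry */
#define FAULT_TABLE_ENTRIES 32u  /* inferred: table ends at byte offset 252 */

/* Byte offset of the fault table entry selected by a fault type number. */
uint32_t fault_table_entry_offset(unsigned fault_type) {
    assert(fault_type < FAULT_TABLE_ENTRIES);
    return fault_type * FAULT_ENTRY_BYTES;
}
```

For example, a constraint fault (type 5) selects the entry at byte offset 40, and a type fault (type 0AH) selects the entry at offset 80.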


3.10.3 FAULT HANDLING ACTION 

When a fault occurs, the processor performs an implicit 
call operation to the procedure specified in the fault 
table. In addition to performing the implicit call opera- 
tion, the processor creates a fault record in its newly 
allocated stack frame. This fault record contains infor- 
mation on the state of the processor when the fault 
occurred and the fault type and subtype (Figure 3-11). 

Some faults can be recovered from easily. When recov- 
ery from a fault is possible, the processor’s fault han- 
dling mechanism allows the processor to automatically 
resume work where the fault was signalled. The re- 
sumption action is initiated with the ret instruction. If 
simple recovery from a fault is not possible, then the 
fault handling procedure may call a debug monitor, ini- 
tiate a reset, or take other actions to recover from the 
fault. 


3.10.4 TRACING AND DEBUG 

The 80960CA provides a facility for monitoring the ac- 
tivity of the processor by tracing the instruction stream. 
A trace event occurs at points in a program where certain 
types of instructions are encountered or a certain 
IP or data address is encountered. When a trace event 
occurs, a trace fault can be generated and a trace-fault 
handler called which displays or analyzes the state of 
the processor. 


3.10.4.1 Trace Events 


The Trace Control (TC) Register (Figure 3-12) is used 
to specify the types of instructions which cause trace 
events. When a mode bit in the TC register is set, spe- 
cific instructions will generate trace events. For exam- 
ple, if the branch trace mode bit is enabled and a 
branch instruction is executed, a branch trace event will 
be signalled. An event flag is used to record trace 
events. A single event flag is provided for each mode 
bit. Any trace event generates a trace fault when the 
trace enable bit in the process control register is set. 
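
The two-level gating described above (a mode bit in the TC register, then the trace enable bit in the PC register) can be sketched as follows. The event identifiers reuse the trace fault subtype values from Table 3-4; treating the TC mode bits as a mask with those same values is an assumption of this sketch, and all names are hypothetical. The fmark instruction, which bypasses the mode bit, is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Event identifiers, using the trace fault subtype values of Table 3-4. */
enum trace_event {
    TRACE_INSTRUCTION = 0x02,
    TRACE_BRANCH      = 0x04,
    TRACE_CALL        = 0x08,
    TRACE_RETURN      = 0x10,
    TRACE_PRERETURN   = 0x20,
    TRACE_SUPERVISOR  = 0x40,
    TRACE_BREAKPOINT  = 0x80
};

/* An event is signalled only when its mode bit is set. */
int trace_event_signalled(uint32_t tc_mode_bits, enum trace_event event) {
    return (tc_mode_bits & (uint32_t)event) != 0;
}

/* A signalled event raises a trace fault only when tracing is enabled. */
int trace_fault_generated(uint32_t tc_mode_bits, enum trace_event event,
                          int pc_trace_enable) {
    assert(pc_trace_enable == 0 || pc_trace_enable == 1);
    return trace_event_signalled(tc_mode_bits, event) && pc_trace_enable;
}
```

With only the branch mode bit set, a taken branch produces a fault when the trace enable bit is 1, and a call produces nothing.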



The 80960CA recognizes 7 trace events. These events 
are described below. 


Instruction Trace Event — Signalled each time an in- 
struction is executed. This trace event can be used with 
a debug monitor to single step the processor. 

Branch Trace Event — Signalled each time a branch in- 
struction is executed. For conditional branch instruc- 
tions, this event is only signalled when the branch is 
taken. Branch-and-link, call, and return instructions do 
not signal this trace event. 

Call Trace Event — Signalled each time a branch-and-link 
or call instruction is executed. Implicit calls, such 
as those used in interrupt or fault handling, signal this 
event. When a call trace event occurs, the prereturn 
trace flag (bit 3 in local register r0) is set by the 
processor to indicate a prereturn trace pending. 

Pre-Return Trace Event — Signalled just prior to any ret 
instruction. This event is only signalled if the pre-return 
trace flag in register r0 is set. Since the pre-return trace 
flag is set when a call trace event occurs, the call trace 
mode must be enabled before a pre-return trace event 
can be signalled. 

Return Trace Event — Signalled each time a ret 
instruction is executed. 

Figure 3-12. Trace Control Register 

Trace mode bits: Instruction Trace Mode, Branch Trace 
Mode, Call Trace Mode, Return Trace Mode, Prereturn 
Trace Mode, Supervisor Trace Mode, Breakpoint Trace Mode. 

Trace event flags: Instruction Trace, Branch Trace, Call 
Trace, Return Trace, Prereturn Trace, Supervisor Trace, 
Breakpoint Trace, Data Address Breakpoint 0, Data 
Address Breakpoint 1, Instruction Address Breakpoint 0, 
Instruction Address Breakpoint 1. 

Remaining bits are reserved (initialized to 0). 


Supervisor Trace Event — Signalled each time a calls 
instruction is executed where the selected entry type is 
supervisor, or when a ret from supervisor mode is exe- 
cuted. 

Breakpoint Trace Events — Signalled each time a mark 
instruction, fmark instruction, or specified address is 
encountered in the instruction stream. The mark 
instruction signals an event only when the breakpoint trace 
mode is enabled; the fmark (force mark) instruction 
generates a breakpoint trace event regardless of the 
value of the breakpoint trace mode bit. 

Two IP breakpoint registers and two internal data ad- 
dress breakpoint registers are provided on the 
80960CA. These breakpoints are loaded with an in- 
struction or data address using the system control 
(sysctl) instruction. When the address is encountered 
and the breakpoint trace mode bit is set, a breakpoint 
trace event occurs. A corresponding instruction or data 
address event flag is set in the TC register when the 
address is encountered. 


3.10.5 PROCESSOR INITIALIZATION 

The Initial Memory Image (IMI) comprises the data 
structures needed to initialize the 80960CA (Figure 3-13). 
The initialization boot record, in reserved memory 
beginning at FFFFFF00H, contains a pointer to the Processor 
Control Block (PRCB). The PRCB in turn holds 
pointers to the data structures which are necessary to 
execute code on the 80960CA. The PRCB also holds 
several fields which contain information to initially 
configure the 80960CA. 
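
The PRCB layout listed in Figure 3-13 can be written as a C structure of ten 32-bit words, which makes the byte offsets easy to check. This is an illustrative sketch: the field names are hypothetical, and the order follows the field and offset lists as they appear in the figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Processor Control Block, one 32-bit word per field. */
typedef struct {
    uint32_t fault_table_base;             /* 00H */
    uint32_t control_table_base;           /* 04H */
    uint32_t ac_register_initial_image;    /* 08H */
    uint32_t fault_configuration_word;     /* 0CH */
    uint32_t interrupt_table_base;         /* 10H */
    uint32_t system_procedure_table_base;  /* 14H */
    uint32_t reserved;                     /* 18H */
    uint32_t interrupt_stack_pointer;      /* 1CH */
    uint32_t icache_configuration_word;    /* 20H */
    uint32_t regcache_configuration_word;  /* 24H */
} prcb_t;

/* Run-time check that the compiler laid the fields out as in the figure. */
void prcb_check_layout(void) {
    assert(offsetof(prcb_t, fault_table_base) == 0x00);
    assert(offsetof(prcb_t, interrupt_table_base) == 0x10);
    assert(offsetof(prcb_t, interrupt_stack_pointer) == 0x1C);
    assert(offsetof(prcb_t, regcache_configuration_word) == 0x24);
    assert(sizeof(prcb_t) == 0x28);
}
```

A build for the actual target would place an initialized prcb_t in memory and store its address in the PRCB pointer field of the initialization boot record.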


Processor initialization begins by asserting the RESET 
pin. At initialization the processor optionally performs 
an internal self-test. A bus confidence test is also 
performed by calculating a checksum of 8 words read from 
external memory. If either of these self-tests fails, the 
FAIL pin indicates the failure and the processor aborts 
initialization. If the self-test passes, the 80960CA 
continues with initialization and branches to the first 
address of the user’s code. 

Figure 3-13. Initial Memory Image 

Fixed Data Structures 

Initialization Boot Record: 

Address                  Contents 
FFFFFF00H                Bus Configuration (Least Significant Byte) 
FFFFFF10H                First Instruction Pointer 
FFFFFF14H                PRCB Pointer 
FFFFFF18H to FFFFFF2CH   6 Check Words (for bus confidence self-test) 

Relocatable Data Structures 

User Code 

Processor Control Block (PRCB): 

Byte Offset   Field 
0H            Fault Table Base Address 
4H            Control Table Base Address 
8H            AC Register Initial Image 
CH            Fault Configuration Word 
10H           Interrupt Table Base Address 
14H           System Procedure Table Base Address 
18H           Reserved 
1CH           Interrupt Stack Pointer 
20H           Instruction Cache Configuration Word 
24H           Register Cache Configuration Word 

Control Table 

Interrupt Table 

System Procedure Table 

Other Architecturally Defined Data Structures 
(not required as part of IMI) 

4.0 80960CA SYSTEM IMPLEMENTATION 

This section is an overview of the peripherals integrated 
with the 80960CA core. The features and operation of 
the Bus Controller, DMA Controller, Interrupt Con- 
troller, and the interfaces between these peripherals and 
the core are described. 


4.1 Peripheral Interface 

A program communicates with the on-chip peripherals 
by reading or modifying the special function registers 
(SFRs) or by loading control registers. The SFRs gen- 
erally serve to transfer status information and data be- 
tween a peripheral and the core, and the control regis- 
ters serve to configure the peripherals. SFRs are ac- 
cessed directly as instruction operands. The control 
registers are loaded by using the system control (sysctl) 
instruction. 


4.2 Bus Controller Unit 

The Bus Controller Unit (BCU) manages the data and 
instruction path between the 80960CA and external 
memory. Data operations and instruction fetches share 
a 32-bit data bus. Memory addresses are output on a 
separate 32-bit address bus. The BCU incorporates sev- 
eral advanced features to simplify the bus interface to 
external memory. A programmable memory region con- 
figuration table allows the characteristics of the exter- 
nal bus to be programmed differently for 16 separate 
regions in memory. The attributes of the external bus 
which are programmable include wait states and exter- 
nal ready control, data bus width (8, 16, or 32 bits), 
burst mode, address pipelining, and byte ordering. The 
region programmable bus options are described in this 
section. 

4.2.1 BUS TRANSFERS, ACCESSES, AND 
REQUESTS 

The distinction between transfer, bus access, and bus 
request, as these terms apply to the 80960CA, must be 
presented before beginning a discussion of the BCU. 

Transfer — A bus transfer is defined simply as a move- 
ment of code or data between a memory system and the 
80960CA. A write transfer occurs when the memory 
system is the destination of a data movement. A read 
transfer occurs when the 80960CA is the destination 
for a data or a code fetch from memory. 

Bus Access — A bus access is defined as an address cycle 
and one or more transfers. In burst mode, an access can 
consist of a single address cycle and 1 to 4 transfers. 


Bus Request — A bus request is issued by the core and 
directed to the Bus Controller. A bus request is sent to 
the BCU when a load, store, or an atomic instruction is 
executed, or when an instruction fetch is needed. Bus 
requests are also issued by the core to perform DMA 
transfers. A bus request can consist of one or more bus 
accesses. For example, an aligned word (32-bit) request 
to an 8-bit memory region will result in four byte- 
length accesses. 
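
The relationship between a request, its accesses, and its transfers can be sketched in a few lines of Python. This is an illustration only, not a description of the BCU's internal logic: the function name `accesses_for_request` is hypothetical, the request is assumed to be aligned, and a burst access is assumed to carry up to four transfers.

```python
import math

def accesses_for_request(request_bytes, bus_width_bits, burst=False):
    """Return (transfers, accesses) for one aligned bus request.

    Assumptions (illustrative only): each transfer moves one bus-width
    unit, a non-burst access carries one transfer, and a burst access
    carries up to four transfers.
    """
    bus_bytes = bus_width_bits // 8
    transfers = math.ceil(request_bytes / bus_bytes)
    per_access = 4 if burst else 1
    accesses = math.ceil(transfers / per_access)
    return transfers, accesses

# An aligned word (4-byte) request to an 8-bit region, burst off:
print(accesses_for_request(4, 8))   # (4, 4)
```

With burst mode off, the word request to the 8-bit region yields the four byte-length accesses described above.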

4.2.2 BUS CONTROL COPROCESSOR 

The 80960CA’s peripherals are often referred to as co- 
processors, since their operation is decoupled from the 
execution of the instruction stream. As an integrated 
coprocessor, the BCU receives bus requests and inde- 
pendently carries out the action of moving data or code 
between the processor and external memory. The BCU 
uses a three deep queue to store pending bus requests. 
The queue decouples the core from the BCU, since a 
series of adjacent requests may be issued faster than the 
BCU can service each request. Two of the three queue 
entries store requests from a user’s program (loads, 
stores, fetches, etc.). The third queue entry is used by 
requests originating from a DMA operation. This 
queue entry takes user requests when the DMA is 
turned off. The 80960CA alternates service of requests 
issued by the user program and requests issued by a 
DMA operation. 

4.2.3 SIGNAL DESCRIPTIONS 

The external bus signals consist of 30 address signals, 4 
byte enables, 32 data lines, and various control signals. 

D31-D0 32-bit Data Bus (bi-directional) — 32-, 16-, 
and 8-bit values are transmitted and re- 
ceived on these lines. The 8- and 16-bit 
quantities are transferred on the low order 
data lines when a memory region is config- 
ured respectively for an 8- or 16-bit bus. 

A31-A2 30-bit Address (outputs) — The 30-bit ad- 
dress bus identifies all external addresses to 
word (4-byte) boundaries. The byte enable 
lines indicate the selected byte in each 
word. 

BE3-BE0 Byte Enables (outputs) — The byte enables 
select which of 4 addressed bytes are active 
in a memory access. When a memory re-
gion is configured for an 8-bit bus width,
BE1 and BE0 act as the lower two bits of
the address. For a 16-bit memory region,
BE1, BE3, and BE0 are encoded to provide
A1, BHE, and BLE respectively.

W/R Write or Read (output) — This signal is low 
for read accesses and high for write access- 
es. 

ADS Address Strobe (output) — Indicates valid
address and the start of a new bus access.

DT/R Data Transmit or Receive (output) —
Direction control for data transceivers;
similar to W/R.

DEN Data Enable (output) — Low during a
bus request after the first address cycle.
This signal is used to control data
transceivers and to indicate the end of
a bus request.

BLAST Burst Last (output) — Indicates the last
transfer in a burst access.

WAIT Wait (output) — Indicates that wait
states are being inserted by the internal
wait state generator.

READY Ready (input) — Signals that data is
valid for a read transfer or ends data
hold for a write transfer. This function
can be disabled for a memory region.

BTERM Burst Terminate (input) — Terminates
a burst access. Another address is gen-
erated to complete the request when
the signal is deasserted. This function
can be disabled for a memory region.

D/C Data or Code (output) — Indicates a
data transfer or a code fetch.

DMA DMA Access (output) — Indicates that
a bus request was initiated by either
the user program or the DMA.

SUP Supervisor Access (output) — Indicates
that a bus access originated from a bus
request issued in supervisor mode.
This signal can be used to protect sys-
tem data structures, or peripherals
from errant modification by the user
code.

LOCK Lock (output) — Indicates that an
atomic memory operation is in prog-
ress. This signal can be used to inhibit
external agents from modifying memo-
ry which is atomically accessed.

HOLD Hold (input) — HOLD can be used by
a bus requester to request access to the
bus. The processor asserts HLDA af-
ter the current bus request or locked
requests have completed.

HOLDA Hold Acknowledge (output) — Indi-
cates to a bus requester that the proc-
essor has relinquished control of the
bus.

BREQ Bus Request (output) — Indicates that
requests are queued in the bus control-
ler and are waiting to be serviced.
BREQ can be used for external bus ar-
bitration logic in conjunction with
HOLD and HLDA to regain bus mas-
tership.

Figure 4-1 shows the timing for a simple, non-burst,
non-pipelined read and write access. The timing rela-
tions for the key control signals are shown in this fig-
ure.



Figure 4-1. Basic Read and Write Request

4.2.4 MEMORY REGION CONFIGURATION
TABLE


The BCU can be configured differently for 16 separate 
sections (referred to as regions) of the address space. 
The four most significant bits of a memory address de- 
fine the location of each region in memory. The bus 
characteristics in a region are specified in the memory 
region configuration table. When a bus request is serv- 
iced, the BCU accesses the configuration table entry for 
the region addressed and services the request based on 
the bus characteristics programmed for that region. 
The characteristics programmed for each region are 
listed below: 


— Burst Mode (on/off)

— Wait States (5 parameters)

— Bus Width (8-, 16-, or 32-bit)

— Ready Inputs (on/off)

— Address Pipelining (on/off)

— Byte Ordering (Big/Little Endian)


The configuration table is made up of 16 on-chip con- 
trol registers (Figure 4-2). Each register is programmed 
with the configuration information for a single region. 
Since the region table is located on-chip, access to re- 
gion information does not affect the performance of the 
bus. 
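
The region lookup described above can be modelled as a 16-entry table indexed by the top four address bits. The sketch below is a software analogy only: the `RegionConfig` class and its field names are hypothetical, the five wait state parameters are left out for brevity, and the real table is held in on-chip control registers rather than in memory.

```python
from dataclasses import dataclass

@dataclass
class RegionConfig:
    # Hypothetical field names mirroring the attributes listed in the text.
    burst: bool = False
    bus_width_bits: int = 32
    ready_enabled: bool = False
    pipelined: bool = False
    big_endian: bool = False

# One entry per region, 16 regions in total.
region_table = [RegionConfig() for _ in range(16)]

def region_index(address):
    """The four most significant bits of a 32-bit address select the region."""
    return (address >> 28) & 0xF

def lookup(address):
    """Return the configuration entry that governs this address."""
    return region_table[region_index(address)]

# Example values chosen for illustration: region 0xF set up as 8 bits wide.
region_table[0xF] = RegionConfig(bus_width_bits=8)
print(region_index(0xF000_1000), lookup(0xF000_1000).bus_width_bits)
```

Each region therefore covers 256 Mbytes of the 4-Gbyte address space, since 28 address bits remain below the region index.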

4.2.4.1 Burst Accesses

The 80960CA BCU is capable of burst accesses to 
memory systems which are designed to support this fea- 
ture. Burst mode is intended to get the most perform- 
ance from low cost memory systems. A burst access is a 
single address cycle followed by successive data or in- 
struction transfers. The transfers reference data or in- 
structions at sequential addresses starting at the address 
which began the burst access (Figure 4-3). In a burst 
memory system, the upper 28 bits of an address remain 
fixed while the lower two bits A2 and A3 increment to 
access subsequent locations. 
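
As a rough illustration of the address sequence in a burst, the following Python sketch lists the word addresses for one burst access in a 32-bit region. The function name is hypothetical, and the sketch assumes the burst ends at the 16-byte block boundary because only A3 and A2 increment.

```python
def burst_word_addresses(start_address, max_transfers=4):
    """List the word addresses touched by one burst access.

    Assumption: the burst stays inside the 16-byte block selected by the
    upper 28 address bits, so only A3:A2 change.
    """
    base = start_address & ~0xF          # upper 28 bits stay fixed
    first = (start_address >> 2) & 0x3   # starting value of A3:A2
    count = min(max_transfers, 4 - first)
    return [base | ((first + i) << 2) for i in range(count)]

print([hex(a) for a in burst_word_addresses(0x1008)])
```

A burst that starts part-way through a block, as in the call above, has fewer than four transfers available before A3:A2 would wrap.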


The flexibility of region programming simplifies the bus 
interface in applications where a memory system is 
made up of a variety of sub-systems, such as SRAM, 
DRAM, ROM, and memory mapped peripherals. Each 
memory sub-system can be mapped into a different re- 
gion in memory, and that region can be configured spe- 
cifically for the requirements of the particular memory 
sub-system. 


Wait state timing for the first access of a burst request 
is controlled independently from the timing for subse- 
quent accesses. A memory sub-system using static col- 
umn mode or page mode DRAMs, for example, can 
take advantage of the short column access times for 
these devices by using burst mode. Interleaved ROM or 
EPROM systems can also be constructed which simul- 
taneously access several words and then use burst mode 
to multiplex the multi-word array onto the data bus. 


Figure 4-2. Memory Region Configuration Table
(diagram labels: Memory Region, Region Table Entry)

Figure 4-3. Burst Memory Request
(timing diagram: read request)

Figure 4-4. Programmable Wait States
(timing diagrams: read request and write request, showing the
wait state counter, clock, address, data, and bus access)

4.2.4.2 Programmable Wait State Generation

The 80960CA may be interfaced with a variety of mem- 
ory sub-systems and peripherals with a minimum sys- 
tem cost and complexity. To achieve this interface flexi- 
bility, the 80960CA implements an internal program- 
mable wait state generator. Internally generated wait 
states eliminate the potential system delays which come 
from generating wait states with external logic. 

Wait states are programmed for each region in the 
memory region configuration table. The number of wait 
states is programmable over a range which allows effi- 
cient control of memory devices ranging from ultra-fast 
SRAMs to slow peripherals. An external ready signal is 
also provided for external wait state control. 

The wait states which can be generated by the 
80960CA are shown in Figure 4-4. In this figure, N is the
number of wait states inserted. The wait states for read 
accesses and for write accesses are described by three 
parameters each. For read accesses, Nrad is the num- 
ber of states between the address cycle and the first 
data cycle and Nrdd is the number of states between 
consecutive data cycles in a burst access. For writes, 
Nwad is the number of states that data is held after an 
address cycle, and Nwdd is the number of states that 
data is held for consecutive data cycles in a burst write. 
For both reads and writes, Nxda is the number of 
dead cycles after the last data cycle and before the next 
address. 
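
The five parameters can be combined into a rough clock count for one access. The sketch below is an estimate under stated assumptions, not a timing specification: it assumes one clock for the address cycle and one clock for each data cycle, and the function and argument names (`n_rad`, `n_rdd`, `n_wad`, `n_wdd`, `n_xda`) are simply lower-case forms of the parameter names in the text.

```python
def read_access_clocks(n_rad, n_rdd, n_xda, transfers=1):
    """Estimated clocks for one read access (assumes 1-clock address and data cycles)."""
    address = 1
    first_data = n_rad + 1                      # wait states, then the first data cycle
    later_data = (transfers - 1) * (n_rdd + 1)  # each later burst transfer
    return address + first_data + later_data + n_xda

def write_access_clocks(n_wad, n_wdd, n_xda, transfers=1):
    """Estimated clocks for one write access under the same assumptions."""
    address = 1
    first_data = n_wad + 1
    later_data = (transfers - 1) * (n_wdd + 1)
    return address + first_data + later_data + n_xda

# Zero wait states: a single read takes 2 clocks, a 4-transfer burst takes 5.
print(read_access_clocks(0, 0, 0), read_access_clocks(0, 0, 0, transfers=4))
```

Under these assumptions, programming a larger first-access value and a smaller burst value is what lets page mode or static column DRAM benefit from burst accesses.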

4.2.4.3 READY Control 

The memory region configuration table allows the 
ready input (READY) to be enabled or disabled for 
each region. If the ready input is disabled, the external 
input has no effect on the wait states generated for a 
memory access; all wait states are generated internally. 
If the ready input is enabled, it works in conjunction 
with the programmable wait state generator. In this
case, the ready input has no effect until the number of
programmed wait states has expired. When the wait 
state counter reaches 0, the ready input is sampled, and 
wait states continue or are terminated based on the val- 
ue of the ready input. In order to gain complete exter- 
nal control over wait states, all wait state parameters 
for a region can be set to 0. 
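
The interaction between the internal counter and the ready input can be sketched as a small simulation. The function name and the list-of-booleans interface are hypothetical, and signal polarity is abstracted away: `True` simply means that the ready input is asserted on that clock.

```python
def wait_states_with_ready(programmed, ready_samples):
    """Count the total wait states for one data cycle.

    programmed    -- wait states counted down by the internal generator
    ready_samples -- booleans, one per clock once the counter has reached 0;
                     True means the ready input is asserted
    """
    total = programmed            # the ready input is ignored while the counter runs
    for ready in ready_samples:
        if ready:
            break                 # the data cycle completes on this clock
        total += 1                # not ready: one more wait state
    return total

# Two programmed wait states, then the external device needs two more clocks.
print(wait_states_with_ready(2, [False, False, True]))   # 4
```

Setting `programmed` to 0 models the case where wait states are controlled entirely by external logic.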

4.2.4.4 Pipelined Reads 

The 80960CA BCU provides an address pipelining 
mode (Figure 4-5) to optimize the performance of in- 
struction and data fetches from external memory. 
When the pipelined read mode is enabled, an address 
cycle overlaps with the last data cycle in each access, 
effectively reducing the total time needed for each ac- 
cess. Pipelining mode is selected in each region by pro- 
gramming the memory region configuration table. 

4.2.4.5 Byte Ordering 

One of two configurations for byte ordering, often re- 
ferred to as little endian or big endian, is selected for 
each region by programming the memory region con- 
figuration table. The byte ordering options make the 
80960CA capable of sharing memory with a processor 
which uses either byte ordering scheme. Byte ordering 
refers to the way that the 80960CA relates internal data 
to the way that data is stored or fetched from memory. 
The little endian configuration orders the bytes in a 
short-word or word so that the least significant byte of 
the quantity is positioned at the lowest address and the 
most significant byte at the highest address in memory. 
Conversely, for the big endian configuration, the least 
significant byte is positioned at the highest address, and 
the most significant byte at the lowest address. For ex- 
ample, for little endian ordering, byte 0 for word data 
would be found in memory at an address of the form 
XXXX XXX0H and, for big endian, at address XXXX
XXX3H.
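
The two orderings can be demonstrated with Python's built-in integer-to-bytes conversion. This is a generic illustration of byte ordering rather than of the 80960CA itself; the value `0x11223344` is arbitrary.

```python
value = 0x11223344            # byte 0 (least significant) is 0x44

little = value.to_bytes(4, "little")
big = value.to_bytes(4, "big")

# Offset of byte 0 within the word for each ordering
print(little.index(0x44), big.index(0x44))   # 0 3
```

The offsets 0 and 3 correspond to the addresses ending in 0H and 3H in the example above.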
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4.2.4.6 Data Alignment 

The 80960CA can service any aligned or non-aligned 
bus request. Aligned requests are directed to their natu- 
ral boundary in memory. In other words, the addresses 
for aligned requests are exact multiples of the length of
the data transferred. Non-aligned requests are not serv- 
iced directly by the BCU but are assisted by microcode. 
Microcode automatically breaks non-aligned requests 
into multiple aligned requests which are then reissued 
to the BCU. Depending on the degree of non-alignment 
and the length of the original request, the resulting re- 
quests by microcode will consist of a combination of 
byte, short-word, and double-word requests. The BCU 
is able to generate an operation-unaligned fault when a 
non-aligned bus request is first received. This fault can 
be selectively masked at initialization. 
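
One way to picture the splitting of a non-aligned request is the greedy sketch below. It is not the microcode algorithm, which the text does not specify; the function name is hypothetical, and the piece sizes of 1, 2, and 4 bytes are an assumption made for illustration.

```python
def split_into_aligned(address, length, max_size=4):
    """Break a request into naturally aligned pieces (illustrative only)."""
    pieces = []
    while length > 0:
        size = max_size
        # Shrink until the piece is aligned to its own size and fits.
        while size > 1 and (address % size != 0 or size > length):
            size //= 2
        pieces.append((address, size))
        address += size
        length -= size
    return pieces

# A 4-byte request starting one byte past a word boundary
print(split_into_aligned(0x1001, 4))
```

In this sketch the non-aligned word becomes a byte, a short-word, and a byte, each of which sits on its natural boundary.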


4.3 DMA Controller

The DMA controller is a high-performance, full-func-
tioned integrated peripheral. The DMA controller can
manage 4 channels of DMA transfer concurrent with
program execution. Separate external control for each
channel is provided. Each channel supports high-per-
formance memory to memory transfers where the
source and destination can be any combination of inter-
nal data RAM or external memory. The DMA Con-
troller supports various types of transfers such as high-
speed fly-by transfers and data chaining with the use of
linked descriptor lists in memory.

The 80960CA’s DMA controller is implemented using
dedicated hardware and microcode. Because of the effi-
ciency of the core, it is possible for the microcode to
execute DMA transfers at high speeds. DMA transfers
are performed by the core concurrently with execution
of the user’s program. Internal DMA logic is used for
sampling requests, synchronizing transfers with exter-
nal devices, and handling the service of multiple active
channels.

4.3.1 SIGNAL DESCRIPTIONS

Twelve pins are dedicated to the DMA controller.
Three pins are associated with each DMA channel.
These pins are described below. In this description, the
pin number corresponds to the channel number. For
example, the DREQ0 pin is the request pin for
channel 0.

DREQ3- DMA Request (input) — This input in-
DREQ0 dicates that an external device is re-
questing a DMA transfer. A DMA
transfer refers to the complete transfer
of one byte, short-word, word, or quad-
word, depending on the transfer data
width selected for the channel.

DACK3- DMA Acknowledge (output) — This
DACK0 output becomes active when the re-
questing device is accessed.

EOP3/TC3- End of Process (input) or Terminal
EOP0/TC0 Count (output) — This pin functions ei-
ther as an input (EOPx) or as an output
(TCx). When programmed as an out-
put, the pin is driven active for one
clock after byte count reaches zero and
a DMA terminates. When programmed
as an input, an external device can
cause the DMA operation to terminate.

4.3.2 DMA TRANSFERS

The 80960CA DMA controller supports a variety of
transfer modes and variations of these modes, allowing
the DMA to adapt to a number of hardware systems
and the performance requirements of these systems.


4.3.2.1 Standard Block and Demand Mode 
Transfers 

A standard DMA transfer is made up of multiple bus 
requests. Loads from a source address are followed by 
stores to a destination address. The DMA controller 
issues the proper combination of these bus requests to 
execute the DMA transfer. For example, a typical 
DMA transfer between memory and an 8-bit peripheral 
could appear as a single byte load request directed to 
the source memory, followed by a single byte store re- 
quest directed to the 8-bit peripheral. 



The DMA controller has two basic transfer modes: 
block mode (unsynchronized) and demand mode (syn- 
chronized). Any DMA transfer will be serviced by one 
of these basic transfer modes. 


A block mode DMA is initiated by software. Block 
mode DMAs are generally memory to memory. Block
mode DMA transfers are not synchronized with any 
type of request from an external device. Once the DMA 
begins, it will continue until the entire block is com- 
plete or until it is suspended. The source and destina- 
tion addresses for block mode transfers can be incre- 
mented or held constant for a DMA. 


A demand mode DMA is controlled by an external 
device. Demand mode DMAs are generally between an 
external device and memory. In demand mode, each 
individual DMA transfer can be synchronized with a 
request. The request is signalled when an external de-
vice activates a DMA channel request pin (DREQ3-
DREQ0). The DMA controller acknowledges this re-
quest with the DMA acknowledge pin (DACK3-
DACK0) when the requesting device is accessed. A de-
mand mode transfer may be synchronized with either
the source or the destination device.

4.3.2.2 Fly-by Transfers

A fly-by transfer mode is provided for the most per- 
formance-critical DMA applications. Fly-by mode also 
makes very efficient use of the external bus during a 
DMA. Standard DMA transfers involve multiple bus 
requests: load requests directed to the source and a 
store request directed to the destination. Fly-by trans- 
fers only require a single bus request. For a fly-by trans- 
fer, memory sees a load or a store on the bus while the 
requesting device is selected by the DMA acknowledge 
pin. The data is never actually read from or written to 
the 80960CA. For memory to device transfers, the 
processor issues a load, and, while reading the memory, 
accesses the external device with the DMA acknowl- 
edge pin. The data is then written directly to the desti- 
nation device with a single bus request. For a device to 
memory transfer, the reverse operation is performed. 
The DMA issues a store, and, while writing the memo- 
ry, accesses the source device with the DMA acknowl- 
edge pin. In this case, the processor floats the data bus 
and the device’s data is written directly into memory. 

4.3.2.3 Data Chaining 

Each DMA channel can be programmed in a data 
chaining mode. In this mode, all transfer information is 
taken from a linked-list descriptor in memory (Figure 
4-6). Data chaining is started by specifying a pointer to 
a descriptor in memory. The transfer continues until
the number of bytes in the byte count field in the de-
scriptor is transferred. At this time, another linked-list
descriptor may be executed. The next descriptor is
specified by the next-pointer field in the current de-
scriptor. Data chaining continues until a null pointer
is encountered in the next-pointer field. Data chaining 
can be designated as source chaining, destination chain- 
ing, or both. 
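
The descriptor walk can be sketched as a linked-list traversal. The `Descriptor` class and its field names are hypothetical: the text names only a byte count field and a next-pointer field, and the actual in-memory layout of a descriptor is not given here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Descriptor:
    # Hypothetical fields; only the byte count and next pointer are named in the text.
    byte_count: int
    buffer_address: int
    next: Optional["Descriptor"] = None

def run_chain(first):
    """Walk the chain and return the (address, byte_count) buffers in order."""
    buffers = []
    descriptor = first
    while descriptor is not None:     # a null next pointer ends the chain
        buffers.append((descriptor.buffer_address, descriptor.byte_count))
        descriptor = descriptor.next
    return buffers

second = Descriptor(byte_count=64, buffer_address=0x2000)
first = Descriptor(byte_count=128, buffer_address=0x1000, next=second)
print(run_chain(first))
```

Each list element stands for one buffer that the channel transfers completely before following the next pointer.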

In data chaining mode, an option exists which allows 
chaining descriptors to be updated while the DMA is 
running. When this option is enabled, the DMA sets a 
bit in the DMA’s special function register after loading 
a descriptor and then checks this bit before loading the 
next descriptor. If the bit has been cleared by the user, 
the DMA continues; otherwise, the DMA waits for the 
next descriptor to be set up and for the user to clear the 
bit. An interrupt can be generated when each buffer is 
complete or when the DMA is terminated with a null 
pointer or the EOP pin.

4.3.3 TRANSFER CHARACTERISTICS 

The DMA controller provides the programmer with a 
number of options for configuring the characteristics of 
a DMA transfer. Intelligent selection of transfer char- 
acteristics works to balance DMA performance and 
functionality with performance of the user program 
when the DMA is in progress. 


Figure 4-6. Source Data Chaining
(diagram label: Internal Register)

The DMA controller provides features to optimize
transfers by moving a maximum amount of data for 
each bus request issued. This is controlled by specifying 
the width of the source and destination directed bus 
requests for a DMA transfer, and by on-chip assembly 
or disassembly of the transfer when source and destina- 
tion are not of equal widths. 

Data alignment is performed automatically by the 
DMA controller when the source and destination of a 
transfer are not aligned. The alignment algorithm is 
optimized for many transfers, providing a performance 
comparable to the aligned transfer cases. 

4.3.3.1 Transfer Data Length

The transfer data length specifies the length of bus re- 
quests directed to the source and destination in a stan- 
dard DMA transfer. Byte, short, word, or quad-word 
loads and stores are selected for either source or desti- 
nation when a DMA channel is set up. Assembly and 
disassembly of data is automatically performed when 
the source and destination widths are different. This 
feature provides the most efficient use of the bus when 
DMA transfers occur between a source and a destina- 
tion with different external bus widths. 
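
The effect of assembly and disassembly on bus traffic can be estimated with a short calculation. The function name is hypothetical, and the sketch assumes that every load moves exactly the source length and every store moves exactly the destination length.

```python
import math

def standard_transfer_requests(byte_count, source_bytes, destination_bytes):
    """Return (loads, stores) needed to move byte_count bytes.

    Assumption: every load moves source_bytes and every store moves
    destination_bytes, with data assembled or disassembled on-chip.
    """
    loads = math.ceil(byte_count / source_bytes)
    stores = math.ceil(byte_count / destination_bytes)
    return loads, stores

# 64 bytes from a byte-wide source to a word-wide destination
print(standard_transfer_requests(64, 1, 4))   # (64, 16)
```

Four byte loads are assembled into each word store in this example, so the destination bus carries a quarter as many requests as the source.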


The DMA controller provides the option of using quad 
word transfers to enhance DMA performance. When 
quad transfers are specified, the DMA will request a 
four-word load request and four-word store request for 
each DMA transfer. The trade-off for the added DMA 
performance is latency on the external bus, preventing 
requests by the core, or by another DMA channel from 
being immediately serviced. 

4.3.3.2 Data Alignment 

The DMA controller supports transfer of source and 
destination data aligned to different byte boundaries in 
memory. The DMA implements microcode algorithms 
to transfer some non-aligned data with a performance 
level approaching that for aligned transfers. The DMA 
accomplishes this by attempting to issue the maximum
number of aligned bus requests during a DMA (Figure 
4-7). As shown, most of the overhead due to non- 
aligned DMAs is incurred at the beginning and end of 
the DMA. DMAs with low byte counts, therefore, do 
not benefit as much from the data alignment features of 
the DMA. The alignment feature is optimized for 8-bit 
to 8-bit, 32-bit to 32-bit and for 8-bit and 32-bit combi-
nations of source and destination lengths.

4.3.3.3 Channel Priority

The DMA controller arbitrates the priority of the 4 
DMA channels. If multiple DMA channels are en- 
abled, the DMA controller will determine in which or- 
der each channel is serviced. 

The DMA controller can be configured in one of two 
priority modes, fixed mode or rotating mode. The fixed 
mode assumes a fixed priority for each channel with 
channel 0 having the highest priority, followed by chan- 
nels 1, 2, and 3, with channel 3 having the lowest prior- 
ity. The rotating mode updates a channel’s priority to 
the lowest priority after that channel’s DMA transfer is made.
This ensures that a single channel is never locked out by
other active channels. The priority sequence is always 
in the same order, with priority rotating from the low 
channel numbers to the high channel numbers. 
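
The two priority modes can be sketched as a simple arbiter. The function name and interface are hypothetical; the rotating order shown (the channel after the one just serviced becomes highest, wrapping around) is one reading of the description above.

```python
def next_channel(requesting, mode="fixed", last_serviced=None):
    """Pick the next DMA channel to service, or None if none is requesting.

    requesting    -- set of channel numbers (0-3) with a pending transfer
    mode          -- "fixed" or "rotating"
    last_serviced -- channel serviced most recently (rotating mode only)
    """
    if mode == "rotating" and last_serviced is not None:
        # The channel just serviced drops to lowest priority; the order
        # continues upward from the next channel number and wraps around.
        order = [(last_serviced + 1 + i) % 4 for i in range(4)]
    else:
        order = [0, 1, 2, 3]      # channel 0 highest, channel 3 lowest
    for channel in order:
        if channel in requesting:
            return channel
    return None

# Channels 0 and 2 both request; channel 0 was just serviced.
print(next_channel({0, 2}, "fixed"), next_channel({0, 2}, "rotating", 0))
```

In fixed mode channel 0 wins every time it requests, while in rotating mode channel 2 is serviced next, which is how lockout is avoided.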

4.3.3.4 Performance and Latency 
Considerations 

DMA operations and the user program share the re- 
sources of the core and of the external bus. DMA per- 
formance and the performance of the user program are 
coupled directly to the balance of load sharing between 
these two processes. The core resources necessary to 
perform a DMA transfer vary depending on the way a 
channel has been configured. For example, byte assem- 
bly and disassembly requires more processor overhead 
per byte of transfer than does a transfer in which the 
source and destination transfer lengths are equal. The 
performance of a DMA is also tightly coupled to the 
user program’s use of the external bus. If the user pro- 
gram does not make frequent bus requests, the requests 
by the DMA controller will be serviced with little or no 
delay. 

The user can enhance performance of the DMA with 
trade-offs in system complexity and flexibility. Aligned 
transfers eliminate the microcode overhead needed to 
perform the internal alignments. DMAs between re-
gions of equal transfer widths eliminate overhead for
assembly and disassembly. Source or destination mem-
ory configured as burst memory will provide the most
efficient use of the DMA controller when the quad- 
transfer feature is enabled. Using the fly-by mode re- 
duces the number of bus requests needed for a DMA 
since fly-by mode uses only a single load or a single 
store request for each transfer. 

4.3.4 DMA CONTROL AND CONFIGURATION 

The DMA Controller uses an SFR register, the DMA 
command (DMAC) register, and the setup DMA 
(sdma) instruction for configuration and control of a 
DMA. The sdma instruction is used to configure each 
DMA channel. Transfer widths, byte count, source and 
destination addresses for a DMA are specified in this 
instruction. 

The DMAC register (Figure 4-8) is described below. 

The channel enable field enables a DMA once the 
channel is set up. Clearing these bits will also cause a 
DMA transfer to be suspended. 

The terminal count field signals that byte count has 
reached zero and a DMA has ended. 

The channel active field indicates that a channel is idle 
or active. If set, this bit indicates that the channel is 
active. This implies that the channel is servicing a 
transfer or has a request pending. The active bits are 
status information only. 

The channel done field indicates that a DMA operation 
is complete. The done bits are status information only. 

The channel wait field is used for handshaking with a 
user program in data chaining mode. The DMA sets 
these bits when a new linked-list descriptor is read. The 
DMA will not read the next descriptor until this bit is 
cleared by the user. The user can set up the next de-
scriptor and then clear the channel wait bits to dynami-
cally change descriptors.

Figure 4-8. DMA Command Register

A priority mode bit selects rotating or fixed priority
mode. 

The throttle bit selects the maximum amount of core 
resources that the DMA microcode will receive in rela- 
tion to the execution of the user program. 

4.3.5 DMA INTERRUPTS 

The DMA controller is the source of 4 hardware inter- 
rupts in the 80960CA. The DMA Controller can be 
programmed to request an interrupt when a DMA is 
complete, or when a buffer transfer is completed in 
chaining mode. Each channel requests a different inter- 
rupt. 


4.4 Interrupt Controller

The 80960CA Interrupt Controller manages interrupts
which are requested by external agents or by the DMA
Controller. The interrupt controller manages 4 internal
DMA interrupt sources, a single NMI (Non-Maskable
Interrupt) pin, and 8 external interrupt pins. Up to 248
external interrupt sources can be supported by the in-
terrupt controller. The interrupt controller handles the
prioritization of software interrupts, hardware inter-
rupts, and the process priority, and signals the core
when interrupts are to be serviced. The interrupt con-
troller provides the low-latency interrupt service fea-
tured on the 80960CA.

4.4.1 EXTERNAL INTERRUPTS

The 80960CA provides 8 interrupt pins and one NMI
pin for detecting external requests. The interrupt con-
troller allows the 8 interrupt pins to be configured as
dedicated inputs capable of requesting 8 interrupts, or
as a vectored input capable of requesting up to 248
interrupts. The NMI pin is always a dedicated input.
The interrupt controller pins are described below.

XINT7- External Interrupts (inputs) — These pins
XINT0 can be used as dedicated inputs, or acting
together as an 8-bit number, request any in-
terrupt. The inputs are edge or level detect-
ed, and are optionally debounced internally.

NMI Non-Maskable Interrupt (input) — NMI re-
quests the highest priority interrupt. NMI
is always taken and is not maskable (as the
name implies), and not interruptible.


4.4.2 INTERRUPT MODES 

The 8 external interrupt pins can be configured in one 
of three modes: dedicated mode, expanded mode, or 
mixed mode (Figure 4-9). 

4.4.2.1 Dedicated Mode Interrupts 

In dedicated mode, each of the 8 interrupt pins acts as a 
dedicated input. When an external event is detected on 
an interrupt pin, a unique interrupt is requested for that 
pin. It is possible to map each dedicated pin to one of a 
number of possible interrupt vectors. This is accom- 
plished by programming the interrupt map (IMAP) 
control registers with an interrupt vector number for 
each pin. (Recall that interrupt vector numbers are 
8-bit values which reference the 248 vectors in the in- 
terrupt table.) 


[Figure 4-9 panels: Dedicated Mode, Expanded Mode]

Only the upper four bits of the vector number can be
programmed for a dedicated mode interrupt. The lower
four bits are fixed at the value 0010 (binary). With four pro-
grammable bits, one of 15 interrupt vectors is available
for each dedicated pin. These interrupt vectors span the 
even priority levels from priority 2 to 30. The vector at 
priority 0 is not defined. 
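
The arithmetic behind the 15 dedicated vectors can be checked with a short sketch. The function names are hypothetical, and the priority calculation (vector number divided by 8) is assumed from the general 80960 interrupt convention rather than stated in this section; it does reproduce the even priorities 2 through 30 quoted above.

```python
def dedicated_vector(upper_nibble):
    """Build the 8-bit vector number for a dedicated-mode source."""
    if not 1 <= upper_nibble <= 15:
        # An upper field of 0 would give the undefined priority-0 vector.
        raise ValueError("upper four bits must be in the range 1 to 15")
    return (upper_nibble << 4) | 0b0010

def priority_of(vector):
    """Assumed convention: priority is the vector number divided by 8."""
    return vector >> 3

vectors = [dedicated_vector(n) for n in range(1, 16)]
print(len(vectors), [priority_of(v) for v in vectors])
```

The first available vector is 12H (priority 2) and the last is F2H (priority 30).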

The 15 interrupt vectors available to dedicated sources 
can be cached in internal data RAM. If this interrupt 
vector caching feature is selected, the processor will au- 
tomatically fetch the vector from data RAM, eliminat- 
ing the latency caused by a bus request for a vector in 
external memory. 

The DMA Controller can request four interrupts to sig- 
nal the end of a DMA for each of four channels. The 
four interrupt signals from the DMA are handled by 
the interrupt controller in the same way as an interrupt 
pin configured as a dedicated input. Each of the four 
DMA sources may request one of 15 interrupts by pro-
gramming the IMAP for that source.

4.4.2.2 Expanded Mode Interrupts 

In expanded mode, external hardware considers the in-
terrupt pins (XINT0-XINT7) as an 8-bit binary num-
ber. This number is used directly as the interrupt vector 
number. Each of the 248 possible interrupt vectors can 
be referenced in this way, allowing a separate external 
source for each vector. External hardware is responsi- 
ble for recognizing individual hardware sources and 
then driving the interrupt vector number corresponding 
to that source onto the interrupt pins. 

4.4.2.3 Mixed Mode Interrupts 

In mixed mode, the 8 interrupt pins are divided into 
two functional sets. One set functions in dedicated
mode, the other in expanded mode. In mixed mode,
three pins are dedicated interrupt pins (XINT7-
XINT5). A programmable vector number is associated
with each of these pins. The remaining five interrupt
pins (XINT4-XINT0) are treated as the most signifi-
cant five bits of the expanded mode vector number. The
lower order bits are internally forced to 010 (binary) to form
the full 8-bit value for the vector number.
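
The vector formation in expanded and mixed modes reduces to simple bit operations, sketched below. The function names are hypothetical, and the sketch works on logical pin values only; it makes no claim about the electrical polarity of the pins.

```python
def expanded_vector(pins):
    """Expanded mode: the 8 pin values are used directly as the vector number."""
    return pins & 0xFF

def mixed_vector(pins_4_to_0):
    """Mixed mode: five pin bits form the top of the vector; the low bits are 010."""
    return ((pins_4_to_0 & 0x1F) << 3) | 0b010

print(hex(expanded_vector(0x55)), hex(mixed_vector(0b11111)))
```

Because the low three bits are fixed, mixed mode can reach at most 32 distinct vector numbers through the five expanded pins.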


4.4.3 INTERRUPT CONTROLLER SETUP 

The interrupt controller uses two special function regis- 
ters to manage interrupt requests by hardware sources. 
The hardware interrupt pending register (IPND) and 
the hardware interrupt mask register (IMSK) are ad-
dressed as sf0 and sf1 respectively. A single bit in each
register corresponds to each of the 8 possible external 
sources and 4 DMA sources for hardware interrupts. 
The IMSK register performs the function of masking 
hardware interrupts and the IPND register implements 
posting of interrupts requested by hardware. When 
configured for expanded or mixed mode interrupts, bit 
0 of the IMSK register globally masks the expanded 
mode interrupts. 
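
The way the two registers combine can be sketched as a bitwise AND. Everything specific here is an assumption made for illustration: the function name is hypothetical, a set IMSK bit is taken to mean that the source is enabled, and the 12 sources are simply numbered 0 through 11 without claiming any particular bit assignment.

```python
def unmasked_pending(ipnd, imsk):
    """Return the set of bit positions that are both pending and enabled."""
    active = ipnd & imsk
    return {bit for bit in range(12) if active & (1 << bit)}

# Sources 0 and 2 are pending, but only source 2 is enabled.
print(unmasked_pending(0b101, 0b100))   # {2}
```

A pending bit whose mask bit is clear stays posted in this model until the mask is changed.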

4.4.4 NON-MASKABLE INTERRUPT

In addition to the maskable hardware interrupts, a sin- 
gle Non-Maskable Interrupt (NMI) is provided. A dedi- 
cated NMI pin is used to request this interrupt. NMI is 
defined as a higher priority than any hardware inter- 
rupt, software interrupt, or process priority. The NMI 
procedure, therefore, can never be interrupted and 
must execute the return instruction before other proce- 
dures can execute. The NMI procedure is entered 
through vector 248. This vector is cached in internal 
data RAM at initialization to reduce latency for the 
NMI. 
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APPENDIX A 

80960CA CORE IMPLEMENTATION 


The 80960CA Core is a high-performance implementa- 
tion of the 80960 Core Architecture. This section brief- 
ly describes the microarchitecture of the 80960CA core 
and the key constructs used to achieve parallel instruc- 
tion execution. 

The 80960CA core can be divided into the 6 main sub- 
units listed below. 

— Instruction Sequencer 

— Register File 

— Execution Unit 

— Multiply and Divide Unit 

— Address Generation Unit 

— Static Data RAM and Local Register Cache 

Figure A-1 is a simple block diagram of the 80960CA. 
The nucleus of the processor is the Instruction Se- 
quencer and Register File. The other subunits of the 
core, referred to as coprocessors, radiate from these 
units, connecting to either the register (REG) side or 
the memory (MEM) side of the processor. The Instruc- 
tion Sequencer issues directives, via the REG and 
MEM interfaces, which target a specific coprocessor. 
That coprocessor then executes an express function vir-
tually decoupled from the IS and the other coproces-
sors. The REG and MEM data busses shown in Figure
A-1 are used to transfer data between the common 
Register File and the coprocessors. 


A.1 Instruction Sequencer 

The Instruction Sequencer (IS) decodes the instruction 
stream and drives the decoded instruction stream onto 
the coprocessor interfaces. In a single clock, the IS de-
codes up to 4 instructions and issues up to three of these
instructions to the on-chip coprocessors or to the IS 
itself. One register (REG) format, one memory (MEM) 
format, and one control or control and branch (CTRL 
or COBR) format instruction can be issued at one time. 
These instructions are directed respectively to the REG 
coprocessors, the MEM coprocessors, or to the IS. The 
ability to issue multiple instructions in parallel can re- 
sult in the simultaneous execution of many instructions 
at once. An optimizing compiler or hand optimization 
of assembly code can easily produce an instruction 
stream which takes full advantage of the parallel execu- 
tion of the core. 
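
The issue rule above (at most one REG, one MEM, and one CTRL or COBR format instruction per clock, from a decode window of up to 4) can be illustrated with a simplified C model. The in-order stopping rule used here is an assumption of the sketch, not a description of the actual issue logic, and all names are hypothetical.

```c
#include <stddef.h>

/* Instruction formats named in the text; COBR is treated as CTRL here. */
typedef enum { FMT_REG, FMT_MEM, FMT_CTRL } InstrFormat;

/* Simplified in-order model: walk a decode window of up to 4
 * instructions and issue each one until a format slot is needed twice.
 * Returns the number of instructions issued in this clock (0..3). */
static size_t count_issued(const InstrFormat *window, size_t n)
{
    int slot_used[3] = {0, 0, 0};
    size_t issued = 0;

    if (n > 4)
        n = 4;
    for (size_t i = 0; i < n; i++) {
        if (slot_used[window[i]])
            break;                  /* that issue slot is already taken */
        slot_used[window[i]] = 1;
        issued++;
    }
    return issued;
}
```

Under this model, code that interleaves REG, MEM and CTRL format instructions issues more instructions per clock than code that groups instructions of one format together.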

Figure A-1. 80960CA Block Diagram

A technique known as resource scoreboarding is used to
manage the parallel execution of instructions and the
common resources of the processor. A coprocessor, for
example, can scoreboard itself, indicating that it cannot
act on another instruction until an instruction currently
executing on that coprocessor is completed. A specific
form of resource scoreboarding is referred to as register
scoreboarding. When the computation stage of an in-
struction takes more than one clock, the destination
register or registers for the result are scoreboarded as
busy. A subsequent operation needing that particular
register will be delayed until the multi-clock operation
is completed. Instructions which do not use the score-
boarded registers can be executed in parallel.
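
Register scoreboarding as described above can be sketched with one busy bit per register. This is a minimal C illustration: the register numbering (0 to 31 for the 16 local and 16 global registers) and all names are assumptions of the sketch, not part of the 80960CA definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* One busy bit per register; reg must be in the range 0..31. */
typedef struct { uint32_t busy; } Scoreboard;

/* Called when a multi-clock instruction targeting `reg` is issued. */
static void mark_busy(Scoreboard *sb, unsigned reg)
{
    sb->busy |= (UINT32_C(1) << reg);
}

/* Called when the result for `reg` has been returned. */
static void mark_ready(Scoreboard *sb, unsigned reg)
{
    sb->busy &= ~(UINT32_C(1) << reg);
}

/* An instruction may issue only if none of the registers it uses
 * (given as a bit mask) is currently scoreboarded. */
static bool can_issue(const Scoreboard *sb, uint32_t regs_used_mask)
{
    return (sb->busy & regs_used_mask) == 0;
}
```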

The IS manages a three stage parallel instruction pipe- 
line (Figure A-2). In the first stage of the pipeline (pipe 
0), the address of the next instruction is calculated. 
This address may be the next sequential instruction, the 
target of a branch, or a location in microcode. In the 
second stage of the pipeline (pipe 1), the instructions 
are issued to the rest of the machine. In the third stage 
(pipe 2), the instruction computation is started, and for 
single cycle instructions, a result is returned. 

Several microarchitectural features of the core are de-
signed to minimize performance loss due to pipeline
breaks.

Branch Prediction — To minimize pipeline breaks due to
branching, the user can specify the direction that a con-
ditional branch instruction will usually follow. The
processor will execute along the specified instruction
path with no pipeline break. If the branch direction
specified was the direction actually selected by execu-
tion of the conditional branch, no pipeline break oc-
curs. The direction of the branch guess is determined
by a bit value in the CTRL format instructions.

Register Bypassing — Register bypassing is a feature
which forwards the result of an instruction for immedi-
ate use as the source of another instruction. This for-
warding occurs at the same time that the value is writ-
ten to its destination register. Bypassing the register file
saves the one clock cycle break which would otherwise
occur while waiting for the value to be written to the
register file and the register scoreboard to be cleared.

On-chip Cache — The on-chip instruction cache and lo-
cal register cache eliminate many pipeline breaks which
will occur if the IS is forced to wait for code or data to
be moved between the 80960CA and external memory.

Register File Access — The Register File allows multiple
instructions to gain access to the register set simulta-
neously. This eliminates pipeline breaks which would
be caused by a loss of access to the register set by any
coprocessor.

A.1.1 INSTRUCTION CACHE

The IS includes a 1 Kbyte two-way set associative in-
struction cache capable of delivering up to four instruc-
tions each clock to the Instruction Sequencer. The
cache allows inner loops of code to execute with no
external instruction fetches.

A.1.2 MICROCODE ROM

The 80960CA uses microcode ROM to implement com-
plex instructions and functions. This includes calls, re-
turns, DMA transfers, and initialization sequences. Mi-
crocode provides an inexpensive and simple method for
implementing complex instructions in the mostly RISC
environment of the 80960CA. When the IS encounters
a microcoded instruction, it automatically branches to
the microcode routine. The 80960CA performs this mi-
crocode branch in 0 clocks,
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Pipe 1 

[Timing diagram; recoverable stage labels: pipe 1, issue; pipe 2, execute and return.]


Figure A-2. Instruction Pipeline 
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A.2 Register File 

The Register File (RF) contains the 16 local and 16 
global registers. The register file has six ports (Figure 
A-3), allowing parallel access of the register set by sev- 
eral 80960CA coprocessors. This parallel access results 
in an ability to execute one simple logic or arithmetic 
instruction, one memory operation (load/store), and 
one address calculation per clock. 

MEM coprocessors interface to the RF with a 128-bit 
wide load bus and a 128-bit wide store bus. These bus- 
ses enable movement of up to 4 words per clock to and 
from the RF. These busses also allow LOAD data from 
a previous read access and STORE data from a current 
write access to be processed in the register file simulta- 
neously. An additional 32-bit port allows an address or 
address reduction operand to be simultaneously fetched 
by the Address Generation Unit. 

REG coprocessors interface to the RF with two 64-bit 
source busses and a single 64-bit destination bus. With 
this bus structure, two source operands are simulta- 
neously issued to a REG coprocessor when an instruc- 
tion is issued. A 64-bit destination bus allows the result 
from the previous operation to be written to the RF at 
the same time that the current operation’s source oper- 
ands are issued. 


A.3 Execution Unit 

The Execution Unit is the 32-bit Arithmetic and Logic 
Unit of the 80960CA Core. The EU can be viewed as a 
self-contained REG coprocessor with its own instruc- 
tion set. As such, the EU is responsible for executing or 
supporting the execution of all the integer and ordinal 
arithmetic instructions, the logic and shift instructions, 
the move instructions, the bit and bit field instructions, 
and the compare operations. The EU performs any 
arithmetic or logical instructions in a single clock. 


A.4 Multiply and Divide Unit

The Multiply and Divide Unit (MDU) is a REG coproc- 
essor which performs integer and ordinal multiply, di- 
vide, remainder, and modulo operations. The MDU de- 
tects integer overflow and divide by zero errors. The 
MDU is optimized for multiplication, performing 32- 
bit multiplies in 4 clocks. The MDU performs multi- 
plies and divides in parallel with the main execution 
unit. 


A.5 Address Generation Unit 

The Address Generation Unit (AGU) is a MEM coproc-
essor which computes the effective addresses for memo-
ry operations. It directly executes the load address in-
struction (lda) and calculates addresses for loads and
stores based on the addressing mode specified in these 
instructions. The address calculations are performed in 
parallel with the main execution unit (EU). 


A.6 Data RAM and Local Register 
Cache 

The Data RAM and Local Register Cache is part of a 
1.5 Kbyte block of on-chip Static RAM (SRAM). 
1 Kbyte of this SRAM is mapped into the 80960CA’s
address space from location 00000000H to
000003FFH. A portion of the remaining 512 bytes is
dedicated to the Local Register Cache. This part of 
internal SRAM is not directly visible to the user. Loads 
and Stores, including quad-word accesses, to the inter- 
nal SRAM are typically performed in only one clock. 
The complete local register set, therefore, can be moved 
to the local register cache in only four clocks. 
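
The address window and the four-clock figure quoted above follow from simple arithmetic, sketched here in C. The function and macro names are hypothetical, and the clock count assumes one quad-word (four 32-bit registers) transfer per clock, which the text describes as typical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DATA_RAM_BASE  UINT32_C(0x00000000)
#define DATA_RAM_LIMIT UINT32_C(0x000003FF)  /* last byte of the 1 Kbyte window */

/* True when addr falls inside the user-visible on-chip data RAM. */
static bool is_internal_data_ram(uint32_t addr)
{
    /* The base is 0, so only the upper bound needs checking. */
    return addr <= DATA_RAM_LIMIT;
}

/* Clocks needed to move the 16 local registers at 4 words per clock. */
static unsigned clocks_to_move_local_registers(void)
{
    const unsigned local_registers = 16;
    const unsigned words_per_clock = 4;   /* one 128-bit quad-word access */
    return local_registers / words_per_clock;
}
```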
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Figure A-3. Six-Port Register File 
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32-BIT HIGH PERFORMANCE EMBEDDED PROCESSOR 


• Two Instructions/Clock Sustained Execution

• Four 59 Mbytes/s DMA Channels with Data Chaining 

• Demultiplexed 32-Bit Burst Bus with Pipelining 


■ 32-bit Parallel Architecture 

— Two Instructions/clock Execution 
— Load/Store Architecture 
— 16, 32-bit Global Registers 
— 16, 32-bit Local Registers
— Manipulate 64-Bit Bit Fields 
— 11 Addressing Modes 
— Full Parallel Fault Model 
— Supervisor Protection Model 

■ Fast Procedure Call/Return Model 
— Full Procedure Call in 4 clocks 
— RISC Call in 2 clocks (BAL) 

■ On-Chip Register Cache 

— Caches Registers on Call/Ret 
— Minimum of 6 Frames provided 
— Number of Frames Programmable, 
up to 15 

■ On-Chip Instruction Cache 

— 1 Kbyte Two-Way Set Associative 

— 128-bit Path to Instruction Sequencer 
— Cache-Lock Modes 

— Cache-Off Mode 


■ High Bandwidth On-Chip Data RAM

— 1 Kbyte On-chip RAM for Data
— Sustain 128-bits per clock access 

■ Four On-Chip DMA Channels 

— 59 Mbytes/s Fly-by Transfers 

— 32 Mbytes/s Two-Cycle Transfers 
— Data Chaining 

— Data Packing/Unpacking 
— Programmable Priority Method 

■ 32-Bit Demultiplexed Burst Bus

— 128-Bit Internal Data Paths to and 
from Registers 

— Burst Bus for DRAM Interfacing 
— Address Pipelining Option 
— Fully Programmable Wait States 
— Supports 8, 16 or 32-bit Bus Widths 
— Supports Unaligned Accesses 
— Supervisor Protection Pin 

■ High-Speed Interrupt Controller
— Up to 248 External Interrupts 

— 32 Fully Programmable Priorities 
— Multi-mode 8-bit Interrupt Port 
— Four Internal DMA Interrupts 

— Separate, Non-maskable Interrupt Pin 
— Context Switch in 750 ns Typical 
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1.0 PURPOSE 

This document provides a preview of the electrical 
characteristics expected of the 33, 25 and 16 MHz 
versions of the 80960CA. For a detailed description 
of any 80960CA functional topic, other than para- 
metric performance, consult the latest 80960CA 
Product Overview (Order No. 270669), or the 
80960CA User’s Manual (Order No. 270710). 


2.0 80960CA OVERVIEW 

The 80960CA is the second-generation member of
the 80960 Family of embedded processors. The
80960CA is object code compatible with the 32-bit
80960 Core Architecture while including Special
Function Register extensions to control on-chip pe-
ripherals, and instruction set extensions to shift 64-
bit operands and configure on-chip hardware. Multi-
ple 128-bit internal busses, on-chip instruction cach-
ing and a sophisticated instruction scheduler allow
the processor to sustain execution of two instruc-
tions every clock, and peak at execution of 3 instruc-
tions per clock.


A 32-bit demultiplexed and pipelined burst bus pro- 
vides a 132 Mbyte/s bandwidth to a system’s high- 
speed external memory sub-system. In addition, the 
80960CA’s on-chip caching of instructions, proce- 
dure context and critical program data substantially 
decouples system performance from the wait states 
associated with accesses to the system’s slower, 
cost sensitive, main memory sub-system. 

The 80960CA bus controller also integrates full wait
state and bus width control for highest system per- 
formance with minimal system design complexity. 
Unaligned access and Big Endian byte order support 
reduces the cost of porting existing applications to 
the 80960CA. 


The processor also integrates four complete data-
chaining DMA channels and a high-speed interrupt
controller on-chip. The DMA channels perform sin-
gle-cycle or two-cycle transfers, data packing and
unpacking, and data chaining. Block transfers, in ad-
dition to source or destination synchronized trans-
fers, are provided.

The interrupt controller provides full programmability 
of 248 interrupt sources into 32 priority levels with a
typical interrupt task switch (“latency”) time of 
750 ns. 





[Block diagram; legible block labels: Programmable Interrupt Controller, Multiply/Divide Unit.]


Figure 2. 80960CA Block Diagram 
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2.1. The C-Series Core 

The C-Series core is a very high performance micro- 
architectural implementation of the 80960 Core Ar- 
chitecture. The C-Series core can sustain execution 
of two instructions per clock (66 MIPS at 33 MHz).
To achieve this level of performance, Intel has incor- 
porated state-of-the-art silicon technology and inno- 
vative microarchitectural constructs into the imple- 
mentation of the C-Series core. Factors that contrib- 
ute to the core’s performance include: 

— Parallel instruction decoding allows issue of up
to three instructions per clock.

— Most instructions execute in a single clock. 

— Parallel instruction decode allows sustained, 
simultaneous execution of two single-clock in- 
structions every clock cycle. 

— Efficient instruction pipeline is designed to mini- 
mize pipeline break losses. 

— Register and resource scoreboarding allow 
simultaneous multi-clock instruction execution. 

— Branch look-ahead and prediction allows many 
branches to execute with no pipeline break. 

— Local Register Cache integrated on-chip caches 
Call/Return context. 

— Two-way set associative, 1Kbyte integrated in- 
struction cache 

— 1Kbyte integrated Data RAM sustains a four-
word (128-bit) access every clock cycle.
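
The 66 MIPS figure quoted above is the product of the sustained issue rate and the clock frequency. A minimal C sketch of that arithmetic follows; the function name is hypothetical.

```c
/* Sustained MIPS = instructions per clock x clock frequency in MHz. */
static unsigned sustained_mips(unsigned instructions_per_clock,
                               unsigned clock_mhz)
{
    return instructions_per_clock * clock_mhz;
}
```

With two instructions per clock at 33 MHz this gives 66. The same formula applied to the peak issue rate of three instructions per clock is plain arithmetic, not a figure stated in this document.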

2.2. Pipelined, Burst Bus 

A 32-bit high performance bus controller interfaces
the 80960CA to external memory and peripherals.
The Bus Control Unit features a maximum transfer
rate of 132 Mbytes per second (at 33 MHz). Internal-
ly programmable wait states and 16 separately con-
figurable memory regions allow the processor to in-
terface with a variety of memory subsystems with a
minimum of system complexity and a maximum of
performance. The Bus Controller’s main features in-
clude:


— Demultiplexed, Burst Bus to exploit most efficient 
DRAM access modes 

— Address Pipelining to reduce memory cost while 
maintaining performance 

— 32-, 16- and 8-bit modes for I/O interfacing ease. 

— Full internal wait state generation to reduce sys- 
tem cost 

— Little and Big Endian support to ease application 
development 

— Unaligned access support for code portability 

— Three-deep request queue to decouple the bus 
from the core 

— Direct interface to Intel’s 27C960 Burst EPROM 
and 82596 Ethernet Controller. 


2.3. Flexible DMA Controller 

A four channel DMA controller provides high speed 
DMA control for data transfers involving peripherals 
and memory. The DMA provides advanced features 
such as data chaining, byte assembly and disassem- 
bly, and a high performance fly-by mode capable of 
transfer speed of up to 59 Mbytes per second at 
33 MHz. The DMA controller features a performance 
and flexibility which is only possible by integrating 
the DMA controller and the 80960CA core. 


2.4. Priority Interrupt Controller 

A programmable-priority interrupt controller man- 
ages up to 248 external sources through the 8-bit 
external interrupt port. The Interrupt Unit also han-
dles the 4 internal sources from the DMA controller,
and a single non-maskable interrupt input. The 8-bit
interrupt port can also be configured to provide indi-
vidual interrupt sources that are level or edge trig-
gered.

Interrupts in the 80960CA are prioritized and sig-
naled within 270 ns of the request. If the interrupt is
of higher priority than the processor priority, the con-
text switch to the interrupt routine typically is com-
plete in another 480 ns. The interrupt unit provides
the mechanism for the low latency and high through-
put interrupt service which is essential for embedded
applications.
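
The two typical figures above add up to the 750 ns interrupt task switch time quoted elsewhere in this document. A small C sketch of the sum, with hypothetical names:

```c
/* Typical figures quoted in the text, in nanoseconds. */
#define PRIORITIZE_AND_SIGNAL_NS 270u
#define CONTEXT_SWITCH_NS        480u

/* Typical time from interrupt request to entry of the interrupt routine. */
static unsigned typical_interrupt_latency_ns(void)
{
    return PRIORITIZE_AND_SIGNAL_NS + CONTEXT_SWITCH_NS;
}
```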
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2.5. Instruction Set Summary 


The following table summarizes the 80960CA instruction set by logical groupings. See the 80960CA User’s 
Manual for a complete description of the instruction set. 


Data Movement   Arithmetic           Logical        Bit, Bit Field and Byte
--------------  -------------------  -------------  -----------------------
Load            Add                  And            Set Bit
Store           Subtract             Not And        Clear Bit
Move            Multiply             And Not        Not Bit
Load Address    Divide               Or             Alter Bit
                Remainder            Exclusive Or   Scan for Bit
                Modulo               Not Or         Span over Bit
                Shift                Or Not         Extract
                *Extended Shift      Nor            Modify
                Extended Multiply    Exclusive Nor  Scan Byte for Equal
                Extended Divide      Not
                Add with Carry       Nand
                Subtract with Carry
                Rotate

Comparison             Branch                Call and Return  Fault
---------------------  --------------------  ---------------  ------------------
Compare                Unconditional Branch  Call             Conditional Fault
Conditional Compare    Conditional Branch    Call Extended    Synchronize Faults
Compare and Increment  Compare and Branch    Call System
Compare and Decrement                        Return
Condition Test                               Branch and Link
Check Bit

Debug                  Processor Management        Atomic
---------------------  --------------------------  -------------
Modify Trace Controls  Modify Process Controls     Atomic Add
Mark                   Modify Arithmetic Controls  Atomic Modify
Force Mark             *System Control
                       *DMA Control
                       Flush Local Registers


NOTE: 

Instructions marked by (*) are 80960CA extensions to the 80960 instruction set.
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3.0 PACKAGE INFORMATION 

3.1. Package Introduction 

This section describes the pins, pinouts and thermal 
characteristics for the 80960CA in the 168-pin Ce-
ramic Pin Grid Array (PGA) package and the 196-pin
Plastic Quad Flat Package (PQFP). For complete 
package specifications and information, see the Intel 
Packaging Specification (Order # 231369). 


3.2. Pin Descriptions 

The 80960CA pins are described in this section. Ta-
ble 1 presents the legend for interpreting the pin de-
scriptions in the following tables.

The pins associated with the 32-bit demultiplexed 
processor bus are described in Table 2. The pins 
associated with basic processor configuration and 
control are described in Table 3. The pins associat- 
ed with the 80960CA DMA Controller and Interrupt 
Unit are described in Table 4.

Figure 3 provides an example pin description table 
entry. The “I/O” signifies that the data pins are in-
put-output. The “S” indicates the pins are synchro-
nous to PCLK2:1. The “H(Z)” indicates that these
pins float while the processor bus is in a Hold Ac-
knowledge state. The “R(Z)” notation indicates that
the pins also float while RESET is low.

All pins float while the processor is in the ONCE™
mode. 


Table 1. Pin Description Nomenclature 


Symbol    Description
--------  ------------------------------------------------
I         Input only pin
O         Output only pin
I/O       Pin can be either an input or output
-         Pins “must be” connected as described
S(...)    Synchronous. Inputs must meet setup and hold
          times relative to PCLK2:1 for proper operation
          of the processor. All outputs are synchronous
          to PCLK2:1.
            S(E) Edge sensitive input
            S(L) Level sensitive input
A(...)    Asynchronous. Inputs may be asynchronous to
          PCLK2:1.
            A(E) Edge sensitive input
            A(L) Level sensitive input
H(...)    While the processor’s bus is in the Hold
          Acknowledge or Bus Backoff state, the pin:
            H(1) is driven to Vcc
            H(0) is driven to Vss
            H(Z) floats
            H(Q) continues to be a valid output
R(...)    While the processor’s RESET pin is low, the pin:
            R(1) is driven to Vcc
            R(0) is driven to Vss
            R(Z) floats
            R(Q) continues to be a valid output


Name      Type    Description
--------  ------  ------------------------------------------------
D31:0     I/O     DATA BUS carries 32, 16 or 8-bit data quantities
          S(L)    depending on bus width configuration. The least
          H(Z)    significant bit of the data is carried on D0 and
          R(Z)    the most significant on D31. When the bus is
                  configured for 8 bit data, the lower 8 data
                  lines, D7:0 are used. For 16 bit data widths,
                  D15:0 are used. For 32 bit data the full data
                  bus is used.


Figure 3. Example Pin Description Entry 
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Table 2. 80960CA Pin Description— External Bus Signals 


Name      Type    Description
--------  ------  ------------------------------------------------------
A31:2     O       ADDRESS BUS carries the upper 30 bits of the physical
          S       address. A31 is the most significant address bit and
          H(Z)    A2 is the least significant. During a bus access,
          R(Z)    A31:2 identify all external addresses to word
                  (4-byte) boundaries. The byte enable signals indicate
                  the selected byte in each word. During burst
                  accesses, A3 and A2 increment to indicate successive
                  data cycles.

D31:0     I/O     DATA BUS carries 32, 16 or 8-bit data quantities
          S(L)    depending on bus width configuration. The least
          H(Z)    significant bit of the data is carried on D0 and the
          R(Z)    most significant on D31. When the bus is configured
                  for 8 bit data, the lower 8 data lines, D7:0 are
                  used. For 16 bit bus widths, D15:0 are used. For
                  32 bit bus widths the full data bus is used.

BE3       O       BYTE ENABLES select which of the four bytes addressed
BE2       S       by A31:2 are active during an access to a memory
BE1       H(Z)    region configured for a 32-bit data-bus width. BE3
BE0       R(1)    applies to D31:24; BE2 applies to D23:16; BE1 applies
                  to D15:8; and BE0 applies to D7:0.

                  32-bit bus:
                    BE3 - Byte Enable 3 - enable D31:24
                    BE2 - Byte Enable 2 - enable D23:16
                    BE1 - Byte Enable 1 - enable D15:8
                    BE0 - Byte Enable 0 - enable D7:0

                  For accesses to a memory region configured for a
                  16-bit data-bus width, the processor directly encodes
                  BE3, BE1 and BE0 to provide BHE, A1 and BLE
                  respectively.

                  16-bit bus:
                    BE3 - Byte High Enable (BHE) - enable D15:8
                    BE2 - Not used (is driven high or low)
                    BE1 - Address Bit 1 (A1)
                    BE0 - Byte Low Enable (BLE) - enable D7:0

                  For accesses to a memory region configured for an
                  8-bit data bus width, the processor directly encodes
                  BE1 and BE0 to provide A1 and A0 respectively.

                  8-bit bus:
                    BE3 - Not used (is driven high or low)
                    BE2 - Not used (is driven high or low)
                    BE1 - Address Bit 1 (A1)
                    BE0 - Address Bit 0 (A0)

W/R       O       WRITE/READ is low (0) for read requests and high (1)
          S       for write requests. The W/R signal changes in the
          H(Z)    same clock cycle as ADS. It remains valid for the
          R(0)    entire access in non-pipelined regions. In pipelined
                  regions, W/R may not be valid in the last cycle of a
                  read access.

ADS       O       ADDRESS STROBE indicates valid address and the start
          S       of a new bus access. ADS is asserted for the first
          H(Z)    clock of a bus access.
          R(1)

READY     I       READY is an input which signals the termination of a
          S(L)    data transfer. READY is used to indicate that read
          H(Z)    data on the bus is valid, or that a write-data
          R(Z)    transfer has completed. The READY signal works in
                  conjunction with the internally programmed wait-state
                  generator. If READY is enabled in a region, the pin
                  is sampled after the programmed number of wait-states
                  has expired. If the READY pin is deasserted high,
                  wait states will continue to be inserted until READY
                  becomes asserted low. This is true for the NRAD,
                  NRDD, NWAD, and NWDD wait states. The NXDA wait
                  states cannot be extended.
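
The three byte-enable encodings in the BE3:0 entry above can be sketched in C as follows. This is an illustration under stated assumptions: each field records the logical role of the pin (true means "lane enabled" or "address bit is 1"), electrical polarity is not modeled, accesses are assumed not to cross the bus width, unused pins are reported as false although the table says they may be driven high or low, and all names are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    bool be3, be2, be1, be0;   /* logical role of each BE pin */
} ByteEnables;

/* 32-bit region: one enable per byte lane touched by the access.
 * addr_low2 = address bits 1:0, nbytes = 1..4. */
static ByteEnables encode_be_32(unsigned addr_low2, unsigned nbytes)
{
    bool lane[4] = { false, false, false, false };
    for (unsigned i = 0; i < nbytes && addr_low2 + i < 4; i++)
        lane[addr_low2 + i] = true;
    ByteEnables be = { lane[3], lane[2], lane[1], lane[0] };
    return be;
}

/* 16-bit region: BE3 = BHE (D15:8), BE1 = A1, BE0 = BLE (D7:0). */
static ByteEnables encode_be_16(unsigned addr_low2, unsigned nbytes)
{
    unsigned a0 = addr_low2 & 1u;
    ByteEnables be;
    be.be3 = (a0 == 1u) || (nbytes >= 2);    /* high byte lane active */
    be.be2 = false;                          /* not used              */
    be.be1 = ((addr_low2 >> 1) & 1u) != 0;   /* carries A1            */
    be.be0 = (a0 == 0u);                     /* low byte lane active  */
    return be;
}

/* 8-bit region: BE1 = A1, BE0 = A0; BE3 and BE2 are not used. */
static ByteEnables encode_be_8(unsigned addr_low2)
{
    ByteEnables be = { false, false,
                       ((addr_low2 >> 1) & 1u) != 0,
                       (addr_low2 & 1u) != 0 };
    return be;
}
```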
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Table 2. 80960CA Pin Description-— External Bus Signals (Continued) 


Name      Type    Description
--------  ------  ------------------------------------------------------
BTERM     I       BURST TERMINATE — The burst terminate signal breaks
          S(L)    up a burst access and causes another address cycle to
          H(Z)    occur. The BTERM signal works in conjunction with the
          R(Z)    internally programmed wait-state generator. If READY
                  and BTERM are enabled in a region, the BTERM pin is
                  sampled after the programmed number of wait states
                  has expired. When BTERM is asserted, additional wait
                  states are inserted until BTERM is deasserted. When
                  BTERM is deasserted, a new ADS signal is generated
                  and the access is completed. The READY input is
                  ignored when BTERM is asserted. BTERM must be
                  externally synchronized to satisfy the BTERM setup
                  and hold times.

WAIT      O       WAIT indicates the status of the internal wait state
          S       generator. WAIT is active when wait states are being
          H(Z)    caused by the internal wait state generator and not
          R(1)    by the READY or BTERM inputs. WAIT can be used to
                  derive a write-data strobe. WAIT can also be thought
                  of as a READY output that the processor provides when
                  it is inserting wait states.

BLAST     O       BURST LAST indicates the last transfer in a bus
          S       access. BLAST is asserted in the last data transfer
          H(Z)    of burst and non-burst accesses after the wait state
          R(0)    counter reaches zero. BLAST remains active until the
                  clock following the last cycle of the last data
                  transfer of a bus access. If the READY or BTERM input
                  is used to extend wait states, the BLAST signal
                  remains active until READY or BTERM terminates the
                  access.

DT/R      O       DATA TRANSMIT/RECEIVE indicates direction for data
          S       transceivers. DT/R is used in conjunction with DEN to
          H(Z)    provide control for data transceivers attached to the
          R(0)    external bus. When DT/R is low (0), the signal
                  indicates that the processor will receive data.
                  Conversely, when high (1) the processor will send
                  data. DT/R will change only while DEN is high.

DEN       O       DATA ENABLE indicates data cycles in a bus access.
          S       DEN is asserted (low) at the start of the first data
          H(Z)    cycle of a bus request and is deasserted (high) at
          R(1)    the end of the last data cycle. DEN is used in
                  conjunction with DT/R to provide control for data
                  transceivers attached to the external bus. DEN
                  remains asserted for sequential reads from pipelined
                  memory regions. DEN is high when DT/R changes.

LOCK      O       BUS LOCK indicates that an atomic read-modify-write
          S       operation is in progress. LOCK may be used to prevent
          H(Z)    external agents from accessing memory which is
          R(1)    currently involved in an atomic operation. LOCK is
                  asserted (0) in the first clock of an atomic
                  operation, and deasserted in the clock cycle
                  following the last bus access for the atomic
                  operation. To allow the most flexibility for a memory
                  system enforcement of locked accesses, the processor
                  will acknowledge a bus hold request when LOCK is
                  asserted. The processor will perform DMA transfers
                  while LOCK is active.

HOLD      I       HOLD REQUEST signals that an external agent requests
          S(L)    access to the external bus. The processor asserts
          H(Z)    HOLDA after completing the current bus request. HOLD,
          R(Z)    HOLDA, and BREQ are used together to arbitrate access
                  to the processor’s external bus by external bus
                  agents.

BOFF      I       BUS BACKOFF — The backoff pin, when asserted (0),
          S(L)    suspends the current access and causes the bus pins
          H(Z)    to float. When the pin is deasserted (1), the ADS
          R(Z)    signal is asserted on the next clock cycle and the
                  access is resumed.
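
The relationship between the programmed wait states, the READY input and the WAIT output described in Table 2 can be sketched with a simplified C model. The model is an assumption made for illustration: it covers a single data cycle, samples READY once per clock after the programmed count expires, and ignores BTERM. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Total wait states in one data cycle. `programmed` wait states come
 * from the internal generator; each following clock on which READY is
 * sampled deasserted adds one more. ready_asserted[i] is the sampled
 * READY state on the i-th clock after the programmed count expires;
 * samples beyond n are treated as asserted. */
static unsigned total_wait_states(unsigned programmed,
                                  const bool *ready_asserted, size_t n)
{
    unsigned waits = programmed;
    for (size_t i = 0; i < n && !ready_asserted[i]; i++)
        waits++;
    return waits;
}

/* WAIT is active only for wait states caused by the internal generator,
 * not for those added by READY. */
static bool wait_pin_active(unsigned clock_index, unsigned programmed)
{
    return clock_index < programmed;
}
```

In this model a region programmed for two wait states whose memory holds READY deasserted for two further clocks sees four wait states in total, with WAIT active only during the first two.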
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Table 2. 80960CA Pin Description — External Bus Signals (Continued) 


Name      Type    Description
--------  ------  ------------------------------------------------------
HOLDA     O       HOLD ACKNOWLEDGE indicates to a bus requestor that
          S       the processor has relinquished control of the
          H(1)    external bus. When HOLDA is asserted, the external
          R(Q)    address bus, data bus, and bus control signals are
                  floated. HOLD, BOFF, HOLDA and BREQ are used together
                  to arbitrate access to the processor’s external bus
                  by external bus agents. Since the processor will
                  grant HOLD requests and enter the Hold Acknowledge
                  state even while RESET is active, the state of the
                  HOLDA pin will be independent of the RESET pin.

BREQ      O       BUS REQUEST indicates that the processor wishes to
          S       perform a bus request. BREQ can be used by external
          H(Q)    bus arbitration logic in conjunction with HOLD and
          R(0)    HOLDA to determine when to return mastership of the
                  external bus to the processor.

D/C       O       DATA OR CODE indicates that a bus request is a data
          S       request (1) or an instruction request (0). D/C has
          H(Z)    the same timing as W/R.
          R(Z)

DMA       O       DMA ACCESS indicates whether the bus request was
          S       initiated by the DMA controller. DMA will be asserted
          H(Z)    (low) for any DMA request. DMA will be deasserted
          R(Z)    (high) for all other requests.

SUP       O       SUPERVISOR ACCESS indicates whether the bus request
          S       is issued while in supervisor mode. SUP will be
          H(Z)    asserted (low) when the request has supervisor
          R(Z)    privileges, and will be deasserted (high) otherwise.
                  SUP can be used to isolate supervisor code and data
                  structures from non-supervisor requests.


Table 3. 80960CA Pin Description-Processor Control Signals 


Name      Type    Description
--------  ------  ------------------------------------------------------
RESET     I       RESET causes the chip to reset. When RESET is
          A(L)    asserted (low), all external signals return to the
          H(Z)    reset state. When RESET is deasserted, initialization
          R(Z)    begins. When the two-x clock mode is selected, RESET
          N(Z)    must remain asserted for 16 PCLK2:1 cycles before
                  being deasserted in order to guarantee correct
                  initialization of the processor. When the one-x clock
                  mode is selected, RESET must remain asserted for
                  10,000 PCLK2:1 cycles before being deasserted in
                  order to guarantee correct initialization of the
                  processor. The CLKMODE pin selects one-x or two-x
                  input clock division of the CLKIN pin.

                  The processor’s Hold Acknowledge bus state functions
                  while the chip is reset. If the processor’s bus is in
                  the Hold Acknowledge state when RESET is activated,
                  the processor will internally reset, but will
                  maintain the Hold Acknowledge state on external pins
                  until the Hold request is removed. If a hold request
                  is made while the processor is in the reset state,
                  the processor bus will grant HOLDA and enter the Hold
                  Acknowledge state.

FAIL      O       FAIL indicates failure of the processor’s self-test
          S       performed at initialization. When RESET is deasserted
          H(Q)    and the processor begins initialization, the FAIL pin
          R(0)    is asserted (0). An internal self-test is performed
                  as part of the initialization process. If this
                  self-test passes, the FAIL pin is deasserted (1),
                  otherwise it remains asserted. The FAIL pin is
                  reasserted while the processor performs an external
                  bus self-confidence test. If this self-test passes,
                  the processor deasserts the FAIL pin and branches to
                  the user’s initialization routine, otherwise the FAIL
                  pin remains asserted. Internal self-test and the use
                  of the FAIL pin can be disabled with the STEST pin.
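
The minimum RESET assertion time in the RESET entry depends on the clock mode. A minimal C sketch of that rule follows; the function name is hypothetical. It relies on the CLKMODE pin description, in which CLKMODE high selects divide-by-one (one-x) and CLKMODE low selects divide-by-two (two-x).

```c
#include <stdbool.h>

/* Minimum number of PCLK2:1 cycles RESET must remain asserted.
 * clkmode_high = true  : one-x mode (CLKIN divided by one)
 * clkmode_high = false : two-x mode (CLKIN divided by two) */
static unsigned min_reset_cycles(bool clkmode_high)
{
    return clkmode_high ? 10000u : 16u;
}
```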
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Table 3. 80960CA Pin Description— Processor Control Signals (Continued) 


Name 

Type 

Description 

STEST 

1 

S(L) 

H(Z) 

R(Z) 

SELF TEST causes the processor's internal self-test feature to be enabled or 
disabled at initialization. STEST is read on the rising edge of RESET. When asserted 
(high), the processor's internal self-test and external bus confidence tests are 
performed during processor initialization. When deasserted (low), the internal 
self-test is skipped and only the external bus confidence test is performed during 
initialization. 

ONCETM 

1 

A(L) 

H(Z) 

R(Z) 

ON CIRCUIT EMULATION causes all outputs to be floated when asserted (low). 
ONCE is continuously sampled while RESET is low, and is latched on the rising edge 
of RESET. To place the processor in the ONCE state: 

(1) assert RESET and ONCE (order does not matter) 

(2) wait for at least 16 CLKIN periods in two-x mode, or 10,000 CLKIN periods in 
one-x mode, after Vcc and CLKIN are within operating specifications 

(3) deassert RESET 

(4) wait at least 32 CLKIN periods 

(The processor will now be latched in the ONCE state as long as RESET is high.) 

To exit the ONCE state, bring Vcc and CLKIN to operating conditions, then assert 
RESET and bring ONCE high prior to deasserting RESET. 

CLKIN must operate within the specified operating conditions of the processor until 
step 4 above has been completed. CLKIN may then be changed to DC to 
achieve the lowest possible ONCE mode leakage current. 

ONCE can be used by emulator products or by board testers to effectively make an 
installed processor transparent in the board. 

CLKIN 

A(E) 

H(Z) 

R(Z) 

CLOCK INPUT is an input for the external clock needed to run the processor. The 
external clock is internally divided as prescribed by the CLKMODE pin to produce 
PCLK2:1. 

CLKMODE 

1 

A(L) 

H(Z) 

R(Z) 

CLOCK MODE selects the division factor applied to the external clock input (CLKIN). 
When CLKMODE is high (1), CLKIN is divided by one to create PCLK2:1 and the 
processor's internal clock. When CLKMODE is low (0), CLKIN is divided by two to 
create PCLK2:1 and the processor's internal clock. CLKMODE should be tied high or 
low in a system, as the clock mode is not latched by the processor. If left 
unconnected, the processor will internally pull the CLKMODE pin low (0), enabling the 
two-x clock mode. 

PCLK2 

PCLK1 

0 

s 

H(Q) 

R(Q) 

PROCESSOR OUTPUT CLOCKS provide a timing reference for all inputs and 
outputs of the processor. All input and output timings are specified in relation to 
PCLK2 and PCLK1. PCLK2 and PCLK1 are identical signals. Two output pins are 
provided to allow flexibility in the system's allocation of capacitive loading on the 
clock. PCLK2:1 may also be connected at the processor to form a single clock signal. 

Vss 


GROUND connections consist of 24 pins which must be connected externally to a Vss 
board plane. 

Vcc 

- 

POWER connections consist of 24 pins which must be connected externally to a Vcc 
board plane. 

N/C 

- 

NO CONNECT pins must not be connected in a system. 
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Table 4. 80960CA Pin Description— DMA and Interrupt Unit Control Signals 


Name 

Type 

Description 

DREQ3 

DREQ2 

DREQ1 

DREQ0 

A(L) 

H(Z) 

R(Z) 

DMA REQUEST causes a DMA transfer to be requested. Each of the four signals 
requests a transfer on a single channel. DREQ0 requests channel 0, DREQ1 
requests channel 1, etc. When two or more channels are requested simultaneously, 
the channel with the highest priority is serviced first. The channel priority mode is 
programmable. 

DACK3 

DACK2 

DACK1 

DACK0 

O 

S 

H(1) 

R(1) 

DMA ACKNOWLEDGE indicates that a DMA transfer is being executed. Each of the 
four signals acknowledges a transfer for a single channel. DACK0 acknowledges 
channel 0, DACK1 acknowledges channel 1, etc. DACK3:0 are active (0) when the 
requesting device of a DMA is accessed. 

EOP3/TC3 

EOP2/TC2 

EOP1/TC1 

EOP0/TC0 

I/O 

A(L) 

H(Z/Q) 

R(Z) 

END OF PROCESS/TERMINAL COUNT can be programmed as either an input 
(EOP3:0) or as an output (TC3:0), but not both. Each pin is individually 
programmable. When programmed as an input, EOPx causes the termination of a 
current DMA transfer for the channel corresponding to the EOPx pin. EOP0 
corresponds to channel 0, EOP1 corresponds to channel 1, etc. When a channel is 
configured for source/destination chaining, the EOP pin for that channel causes 
termination of only the current buffer transferred and causes the next buffer to be 
transferred. EOP3:0 are asynchronous inputs. 

When programmed as an output, the channel's TCx pin indicates that the channel 
byte count has reached 0 and a DMA has terminated. TCx is driven with the same 
timing as DACKx during the last DMA transfer for a buffer. If the last bus request is 
executed as multiple bus accesses, TCx will stay asserted for the entire bus request. 

XINT7 

XINT6 

XINT5 

XINT4 

XINT3 

XINT2 

XINT1 

XINT0 

I 

A(E/L) 

H(Z) 

R(Z) 

EXTERNAL INTERRUPT PINS cause interrupts to be requested. These pins can be 
configured in three modes. 

In the Dedicated Mode, each pin is a dedicated external interrupt source. Dedicated 
inputs can be individually programmed to be level (low) or edge (falling) activated. 

In the Expanded Mode, the 8 pins act together as an 8-bit vectored interrupt source. 
The interrupt pins in this mode are level activated. Since the interrupt pins are active 
low, the vector number requested is the one's complement of the positive logic value 
placed on the port. This eliminates glue logic to interface to combinational priority 
encoders which output negative logic. 

In the Mixed Mode, XINT7:5 are dedicated sources and XINT4:0 act as the 5 most 
significant bits of an expanded mode vector. The least significant bits are set to 010 
internally. 
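
The vector arithmetic described for the Expanded and Mixed modes can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the Mixed Mode helper assumes that XINT4:0 are inverted in the same way as in Expanded Mode, which the text implies but does not state outright.

```python
# Sketch of the interrupt vector formation described for XINT7:0.
# Names are illustrative; pin levels are passed as positive-logic integers
# (bit n set means pin XINTn is electrically high).


def expanded_mode_vector(xint7_0_levels: int) -> int:
    """Expanded Mode: vector is the one's complement of the 8 pin levels."""
    return ~xint7_0_levels & 0xFF


def mixed_mode_vector(xint4_0_levels: int) -> int:
    """Mixed Mode: XINT4:0 supply the 5 most significant vector bits.

    Assumption: the pins are active low here as well, so the levels are
    complemented before use. The 3 least significant bits are fixed at 010.
    """
    upper_bits = ~xint4_0_levels & 0x1F
    return (upper_bits << 3) | 0b010


if __name__ == "__main__":
    print(hex(expanded_mode_vector(0x0D)))
    print(hex(mixed_mode_vector(0b00000)))
```

With all pins high (nothing asserted) the Expanded Mode helper returns 0, matching the active-low convention in the description.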

NMI 

1 

A(E) 

H(Z) 

R(Z) 

NON-MASKABLE INTERRUPT causes a non-maskable interrupt event to occur. 

NMI is the highest priority interrupt recognized. NMI is an edge (falling) activated 
source. 
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3.3. 80960CA Pinout 

3.3.1 80960CA CPGA PINOUT 

Tables 5 and 6 list the 80960CA pin names with package location. Figure 4a depicts 
the complete 80960CA pinout as viewed from the top side of the component (i.e., pins 
facing down). Figure 4b shows the complete 80960CA pinout as viewed from the 
pin-side of the package (i.e., pins facing up). See Section 4.0, Electrical 
Specifications for specifications and recommended connections. 


Table 5. PGA Pin Name with Package Location (Signal Order) 


Address Bus      Data Bus         Bus Control       Processor Control    I/O 
Name  Location   Name  Location   Name   Location   Name     Location    Name      Location 

A31   S15        D31   R03        BE3    S05        RESET    A16         DREQ3     A07 
A30   Q13        D30   Q05        BE2    S06        FAIL     A02         DREQ2     B06 
A29   R14        D29   S02        BE1    S07        STEST    B02         DREQ1     A06 
A28   Q14        D28   Q04        BE0    R09        ONCE     C03         DREQ0     B05 
A27   S16        D27   R02        W/R    S10        CLKIN    C13         DACK3     A10 
A26   R15        D26   Q03        ADS    R06        CLKMODE  C14         DACK2     A09 
A25   S17        D25   S01        READY  S03        PCLK1    B14         DACK1     A08 
A24   Q15        D24   R01        BTERM  R04        PCLK2    B13         DACK0     B08 
A23   R16        D23   Q02        WAIT   S12                             EOP0/TC0  A11 
A22   R17        D22   P03        BLAST  S08                             EOP1/TC1  A12 
A21   Q16        D21   Q01        DT/R   S11                             EOP2/TC2  A13 
A20   P15        D20   P02        DEN    S09                             EOP3/TC3  A14 
A19   P16        D19   P01        LOCK   S14                             XINT7     C17 
A18   Q17        D18   N02        HOLD   R05                             XINT6     C16 
A17   P17        D17   N01        HOLDA  S04                             XINT5     B17 
A16   N16        D16   M01        BREQ   R13                             XINT4     C15 
A15   N17        D15   L01        D/C    S13                             XINT3     B16 
A14   M17        D14   L02        DMA    R12                             XINT2     A17 
A13   L16        D13   K01        SUP    Q12                             XINT1     A15 
A12   L17        D12   J01        BOFF   B01                             XINT0     B15 
A11   K17        D11   H01                                               NMI       D15 
A10   J17        D10   H02 
A9    H17        D9    G01 
A8    G17        D8    F01 
A7    G16        D7    E01 
A6    F17        D6    F02 
A5    E17        D5    D01 
A4    E16        D4    E02 
A3    D17        D3    C01 
A2    D16        D2    D02 
                 D1    C02 
                 D0    E03 

Vss Location: C07, C08, C09, C10, C11, C12, F15, G03, G15, H03, H15, J03, J15, K03, 
K15, L03, L15, M03, M15, Q07, Q08, Q09, Q10, Q11 

Vcc Location: B07, B09, B10, B11, B12, C06, E15, F03, F16, G02, H16, J02, J16, K02, 
K16, M02, M16, N03, N15, Q06, R07, R08, R10, R11 

No Connect Location: A01, A03, A04, A05, B03, B04, C04, C05, D03 
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Table 6. PGA Pin Name with Package Location (Pin Order) 


Location Name   Location Name   Location Name   Location Name   Location Name   Location Name 

A01 NC          A02 FAIL        A03 NC          A04 NC          A05 NC          A06 DREQ1 
A07 DREQ3       A08 DACK1       A09 DACK2       A10 DACK3       A11 EOP0/TC0    A12 EOP1/TC1 
A13 EOP2/TC2    A14 EOP3/TC3    A15 XINT1       A16 RESET       A17 XINT2 

B01 BOFF        B02 STEST       B03 NC          B04 NC          B05 DREQ0       B06 DREQ2 
B07 Vcc         B08 DACK0       B09 Vcc         B10 Vcc         B11 Vcc         B12 Vcc 
B13 PCLK2       B14 PCLK1       B15 XINT0       B16 XINT3       B17 XINT5 

C01 D3          C02 D1          C03 ONCE        C04 NC          C05 NC          C06 Vcc 
C07 Vss         C08 Vss         C09 Vss         C10 Vss         C11 Vss         C12 Vss 
C13 CLKIN       C14 CLKMODE     C15 XINT4       C16 XINT6       C17 XINT7 

D01 D5          D02 D2          D03 NC          D15 NMI         D16 A2          D17 A3 
E01 D7          E02 D4          E03 D0          E15 Vcc         E16 A4          E17 A5 
F01 D8          F02 D6          F03 Vcc         F15 Vss         F16 Vcc         F17 A6 
G01 D9          G02 Vcc         G03 Vss         G15 Vss         G16 A7          G17 A8 
H01 D11         H02 D10         H03 Vss         H15 Vss         H16 Vcc         H17 A9 
J01 D12         J02 Vcc         J03 Vss         J15 Vss         J16 Vcc         J17 A10 
K01 D13         K02 Vcc         K03 Vss         K15 Vss         K16 Vcc         K17 A11 
L01 D15         L02 D14         L03 Vss         L15 Vss         L16 A13         L17 A12 
M01 D16         M02 Vcc         M03 Vss         M15 Vss         M16 Vcc         M17 A14 
N01 D17         N02 D18         N03 Vcc         N15 Vcc         N16 A16         N17 A15 
P01 D19         P02 D20         P03 D22         P15 A20         P16 A19         P17 A17 

Q01 D21         Q02 D23         Q03 D26         Q04 D28         Q05 D30         Q06 Vcc 
Q07 Vss         Q08 Vss         Q09 Vss         Q10 Vss         Q11 Vss         Q12 SUP 
Q13 A30         Q14 A28         Q15 A24         Q16 A21         Q17 A18 

R01 D24         R02 D27         R03 D31         R04 BTERM       R05 HOLD        R06 ADS 
R07 Vcc         R08 Vcc         R09 BE0         R10 Vcc         R11 Vcc         R12 DMA 
R13 BREQ        R14 A29         R15 A26         R16 A23         R17 A22 

S01 D25         S02 D29         S03 READY       S04 HOLDA       S05 BE3         S06 BE2 
S07 BE1         S08 BLAST       S09 DEN         S10 W/R         S11 DT/R        S12 WAIT 
S13 D/C         S14 LOCK        S15 A31         S16 A27         S17 A25 
Figure 4a. 80960CA PGA Pinout (View from Top Side) 
3.3.2 80960CA PQFP Pinout 

Tables 7 and 8 list the 80960CA pin names with package location. See 
Section 4.0, Electrical Specifications for specifications and recommended 
connections. 


Table 7. PQFP Pin Name with Package Location (Signal Order) 


Address Bus      Data Bus         Bus Control       Processor Control    I/O 
Name  Location   Name  Location   Name   Location   Name     Location    Name      Location 

A31   153        D31   186        BE3    176        RESET    091         DREQ3     060 
A30   152        D30   187        BE2    175        FAIL     045         DREQ2     059 
A29   151        D29   188        BE1    172        STEST    046         DREQ1     058 
A28   145        D28   189        BE0    170        ONCE     043         DREQ0     057 
A27   144        D27   191        W/R    164        CLKIN    087         DACK3     065 
A26   143        D26   192        ADS    178        CLKMODE  085         DACK2     064 
A25   142        D25   194        READY  182        PCLK1    078         DACK1     063 
A24   141        D24   195        BTERM  184        PCLK2    074         DACK0     062 
A23   139        D23   003        WAIT   162                             EOP3/TC3  069 
A22   138        D22   004        BLAST  169                             EOP2/TC2  068 
A21   137        D21   005        DT/R   163                             EOP1/TC1  067 
A20   136        D20   006        DEN    167                             EOP0/TC0  066 
A19   134        D19   008        LOCK   156                             XINT7     107 
A18   133        D18   009        HOLD   181                             XINT6     106 
A17   132        D17   010        HOLDA  179                             XINT5     102 
A16   130        D16   011        BREQ   155                             XINT4     101 
A15   129        D15   013        D/C    159                             XINT3     100 
A14   128        D14   014        DMA    160                             XINT2     095 
A13   124        D13   015        SUP    158                             XINT1     094 
A12   123        D12   017        BOFF   040                             XINT0     093 
A11   122        D11   018                                               NMI       108 
A10   120        D10   019 
A9    119        D9    021 
A8    118        D8    022 
A7    117        D7    023 
A6    116        D6    025 
A5    114        D5    026 
A4    113        D4    027 
A3    112        D3    033 
A2    111        D2    034 
                 D1    035 
                 D0    036 

Vss Location: 2, 7, 16, 24, 30, 38, 39, 49, 56, 70, 75, 77, 81, 83, 88, 89, 92, 98, 
105, 109, 110, 121, 125, 131, 135, 147, 150, 161, 165, 173, 174, 185, 196 

Vcc Location: 1, 12, 20, 28, 32, 37, 44, 50, 61, 71, 72, 79, 82, 96, 99, 103, 115, 
127, 140, 148, 154, 168, 171, 180, 190 

No Connect Location: 29, 41, 42, 47, 48, 51, 52, 53, 54, 55, 73, 76, 80, 84, 86, 90, 
97, 104, 126, 146, 149, 157, 166, 177, 183, 193 
Table 8. PQFP Pin Name with Package Location (Pin Order) 


Pin  Signal     Pin  Signal     Pin  Signal      Pin  Signal 

1    Vcc        26   D5         51   NC          76   NC 
2    Vss        27   D4         52   NC          77   Vss 
3    D23        28   Vcc        53   NC          78   PCLK1 
4    D22        29   NC         54   NC          79   Vcc 
5    D21        30   Vss        55   NC          80   NC 
6    D20        31   NC         56   Vss         81   Vss 
7    Vss        32   Vcc        57   DREQ0       82   Vcc 
8    D19        33   D3         58   DREQ1       83   Vss 
9    D18        34   D2         59   DREQ2       84   NC 
10   D17        35   D1         60   DREQ3       85   CLKMODE 
11   D16        36   D0         61   Vcc         86   NC 
12   Vcc        37   Vcc        62   DACK0       87   CLKIN 
13   D15        38   Vss        63   DACK1       88   Vss 
14   D14        39   Vss        64   DACK2       89   Vss 
15   D13        40   BOFF       65   DACK3       90   NC 
16   Vss        41   NC         66   EOP0/TC0    91   RESET 
17   D12        42   NC         67   EOP1/TC1    92   Vss 
18   D11        43   ONCE       68   EOP2/TC2    93   XINT0 
19   D10        44   Vcc        69   EOP3/TC3    94   XINT1 
20   Vcc        45   FAIL       70   Vss         95   XINT2 
21   D9         46   STEST      71   Vcc         96   Vcc 
22   D8         47   NC         72   Vcc         97   NC 
23   D7         48   NC         73   NC          98   Vss 
24   Vss        49   Vss        74   PCLK2       99   Vcc 
25   D6         50   Vcc        75   Vss         100  XINT3 

Pin  Signal     Pin  Signal     Pin  Signal 

148  Vcc        151  A29        154  Vcc 
149  NC         152  A30        155  BREQ 
150  Vss        153  A31        156  LOCK 
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3.4. Mechanical Data 


3.4.1 CERAMIC PGA PACKAGE 
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Table 9. Ceramic PGA Package Dimension Symbols 


Letter or 
Symbol      Description of Dimensions 

A           Distance from seating plane to highest point of body 
A1          Distance between seating plane and base plane (lid) 
A2          Distance from base plane to highest point of body 
A3          Distance from seating plane to bottom of body 
B           Diameter of terminal lead pin 
D           Largest overall package dimension of length 
D1          A body length dimension, outer lead center to outer lead center 
e1          Linear spacing between true lead position centerlines 
L           Distance from seating plane to end of lead 
S1          Other body dimension, outer lead center to edge of body 


NOTES: 

1. Controlling dimension: millimeter. 

2. Dimension “e1” (“e”) is non-cumulative. 

3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch. 

4. Dimensions “B”, “B1” and “C” are nominal. 

5. Details of Pin 1 identifier are optional. 
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3.4.2 PLASTIC QUAD FLAT PACKAGE 



Figure 6. Principal Dimensions and Datums 



Figure 7. Molded Details 



Figure 8. Detail M 






Figure 9. Terminal Details 



Figure 10. Typical Lead 


Table 10. PQFP Package Dimension Symbols 


                                        Inch                mm 
Symbol    Description             Min       Max       Min       Max 

N         Leadcount                    196                 196 
A         Package Height          0.160     0.170     4.06      4.32 
A1        Standoff                0.020     0.030     0.51      0.76 
D, E      Terminal Dimension      1.475     1.485     37.47     37.72 
D1, E1    Package Body            1.347     1.353     34.21     34.37 
D2, E2    Bumper Distance         1.497     1.503     38.02     38.18 
D3, E3    Lead Dimension             1.200 REF           30.48 REF 
D4, E4    Foot Radius Location    1.423     1.437     36.14     36.49 
L1        Foot Length             0.020     0.030     0.51      0.76 

NOTES: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982. 

2. Datum plane -H- located at the mold parting line and coincident with the bottom of the lead where lead exits plastic body. 

3. Datums A-B and -D- to be determined where center leads exit plastic body at datum plane -H-. 

4. Controlling Dimension, Inch. 

5. Dimensions D1, D2, E1 and E2 are measured at the mold parting line. D1 and E1 do not include an allowable mold 
protrusion of 0.18 mm (0.007 in) per side. D2 and E2 do not include a total allowable mold protrusion of 0.18 mm (0.007 in) 
at maximum package size. 

6. Pin 1 identifier is located within one of the two zones indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 
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3.5. Package Thermal Specifications 

The 80960CA is specified for operation when Tc (the case temperature) is within 
the range of 0°C to 100°C. Tc may be measured in any environment to determine 
whether the 80960CA is within the specified operating range. The case temperature 
should be measured at the center of the top surface, opposite the pins. Refer to 
Figure 13. 

Ta (the ambient temperature) can be calculated from θCA (thermal resistance from 
case to ambient) with the following equation: 

Ta = Tc - P * θCA 

Table 11 shows the maximum Ta allowable (without exceeding Tc) at various 
airflows and operating frequencies (fPCLK). 

Note that Ta is greatly improved by attaching fins or a heat sink to the package. 
P (the maximum power consumption) is calculated by using the typical Icc as 
tabulated in Section 4.4, DC Characteristics, and Vcc of 5V. 
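
As a worked illustration of the equation above, the sketch below computes the maximum ambient temperature from the 100°C case limit, the typical Icc values in Section 4.4 and the case-to-ambient resistances in Figure 11. The function name and argument layout are assumptions made for this example.

```python
# Worked example of Ta = Tc - P * theta_CA, with P = Icc(typ) * Vcc.
# The helper name is illustrative and not taken from the datasheet.


def max_ambient_c(tc_max_c: float, icc_typ_a: float, vcc_v: float,
                  theta_ca_c_per_w: float) -> float:
    """Highest ambient temperature that keeps the case at or below tc_max_c."""
    power_w = icc_typ_a * vcc_v
    return tc_max_c - power_w * theta_ca_c_per_w


if __name__ == "__main__":
    # 33 MHz part: 750 mA typical at 5 V; PGA at zero airflow.
    print(max_ambient_c(100.0, 0.750, 5.0, 17.0))  # no heat sink
    print(max_ambient_c(100.0, 0.750, 5.0, 13.0))  # with heat sink
```

Rounded to whole degrees, these two results correspond to the zero-airflow entries for the 33 MHz part in Table 11.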


Table 11. Maximum Ta at Various Airflows in °C (PGA Package Only) 


                                 Airflow, ft/min (m/sec) 
              fPCLK     0      200      400      600      800      1000 
              (MHz)    (0)    (1.01)   (2.03)   (3.04)   (4.06)   (5.07) 

Ta with        33      51      66       79       81       85       87 
Heat Sink*     25      61      73       83       85       88       89 
               16      74      82       89       90       92       93 

Ta without     33      36      47       59       66       73       75 
Heat Sink      25      49      58       67       73       78       80 
               16      66      72       78       82       86       87 


* 0.285" high unidirectional heat sink (Al alloy 6061, 50 mil fin width, 150 mil center-to-center fin spacing). 


PGA Thermal Resistance, °C/Watt 

                                    Airflow, ft/min (m/sec) 
Parameter                    0      200      400      600      800      1000 
                            (0)    (1.01)   (2.03)   (3.04)   (4.06)   (5.07) 

θ Junction-to-Case          1.5     1.5      1.5      1.5      1.5      1.5 
(Case Measured as shown 
in Figure 13) 

θ Case-to-Ambient           17      14       11       9        7.1      6.6 
(No Heatsink) 

θ Case-to-Ambient           13      9        5.5      5.0      3.9      3.4 
(with Unidirectional 
Heatsink)* 


NOTES: 

1. This table applies to 80960CA PGA plugged into socket or soldered directly 
into board. 

2. θJA = θJC + θCA. 

3. θJ-CAP = 4°C/W (approx.) 

θJ-PIN = 4°C/W (inner pins) (approx.) 

^J-PIN = S^C/W (outer pins) (approx.) 

* 0.285" high unidirectional heat sink (Al alloy 6061, 50 mil fin width, 150 mil 
center-to-center fin spacing). 
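
Note 2 gives the junction-to-ambient resistance as the sum of the junction-to-case and case-to-ambient terms. The sketch below applies that sum and then uses the usual definition of thermal resistance (temperature rise equals power times resistance) to estimate a junction temperature. The junction estimate is an assumption added for illustration; the datasheet itself specifies only the case temperature limit.

```python
# theta_JA = theta_JC + theta_CA (note 2), plus an illustrative junction
# temperature estimate. Function names are hypothetical.


def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient thermal resistance in degrees C per watt."""
    return theta_jc + theta_ca


def junction_temp_c(ambient_c: float, power_w: float,
                    theta_ja_c_per_w: float) -> float:
    """Estimated junction temperature: ambient plus P * theta_JA."""
    return ambient_c + power_w * theta_ja_c_per_w


if __name__ == "__main__":
    # PGA, no heat sink, zero airflow: theta_JC = 1.5, theta_CA = 17.
    resistance = theta_ja(1.5, 17.0)
    print(resistance)
    print(junction_temp_c(36.25, 3.75, resistance))
```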


Figure 1 1. 80960CA PGA Package Thermal Characteristics 
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PQFP Thermal Resistance — ^“C/Watt 


Parameter 

θ Junction-to-Case 
(Case Measured as shown in Figure 13) 

θ Case-to-Ambient 
(No Heatsink) 


Airflow— ft/min (m/sec) 

0 50 100 200 400 600 800 

(0) (0.25) (0.50) (1.01) (2.03) (3.04) (4.06) 


19 18 17 15 12 10 


NOTES: 

1 . This table applies to 80960CA PQFP soldered directly into board. 

2. θJA = θJC + θCA. 

3. θJL = 18°C/Watt 
θJB = 18°C/Watt 



Figure 12. 80960CA PQFP Package Thermal Characteristics 


. MEASURE PGA CASE TEMPERATURE 
AT CENTER OF TOP SURFACE 



- MEASURE PQFP TEMPERATURE AT 
CENTER OF TOP SURFACE 


1 68 - PIN PGA 



Figure 13. Measuring 80960CA PGA and PQFP Case Temperature 
3.6 Stepping Register Information 

Upon reset, register G0 contains die stepping information. The following figure 
shows how G0 is configured. The most significant byte contains an ASCII 0. The 
upper middle byte contains an ASCII C. The lower middle byte contains an ASCII A. 
The least significant byte contains the stepping number in ASCII. G0 retains this 
information until it is written over by the user program. 

Table 12 contains a cross reference of the number in the least significant byte of 
register G0 to the die stepping number. 


         MSB                                    LSB 
Value    00       43       41       Stepping Number 
ASCII    0        C        A        Stepping Number 

Figure 14. Register G0 


Table 12. Die Stepping Cross Reference 

G0 Least 
Significant Byte     Die Stepping 

01                   B 
02                   C-1 
03                   C-2 
04                   D 


3.7 Suggested Sources for 80960CA Accessories 

The following are some suggested sources of accessories for the 80960CA. They 
are not an endorsement of any kind, nor a warranty of the performance of any of the 
listed products and/or companies. 

Sockets 

1. 3M Textool Test and Interconnection Products Department 
P.O. Box 2963 
Austin, TX 78769-2963 

2. Augat, Inc. 
Interconnection Products Group 
33 Perry Avenue 
P.O. Box 779 
Attleboro, MA 02703 
(508) 222-2202 

3. Concept Manufacturing Inc. 
(Decoupling Sockets) 
43024 Christy Street 
Fremont, CA 94538 
(415) 651-3804 

Heat Sinks/Fins 

1. Thermalloy, Inc. 
2021 West Valley View Lane 
Dallas, TX 75381-0839 
(214) 243-4321 

2. E G & G Division 
60 Audubon Road 
Wakefield, MA 01880 
(617) 245-5900 
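
Returning to the stepping register of Section 3.6, the byte layout in Figure 14 and the cross reference in Table 12 can be decoded as in the sketch below. It assumes the register value has already been captured as a 32-bit integer immediately after reset; the function and dictionary names are hypothetical.

```python
# Decode the reset value of the stepping register using the byte layout of
# Figure 14 and the mapping of Table 12. All names are illustrative.

DIE_STEPPING = {0x01: "B", 0x02: "C-1", 0x03: "C-2", 0x04: "D"}


def decode_stepping_register(value: int) -> dict:
    """Split a 32-bit register value into its four bytes and interpret them."""
    msb = (value >> 24) & 0xFF
    upper_mid = (value >> 16) & 0xFF
    lower_mid = (value >> 8) & 0xFF
    lsb = value & 0xFF
    return {
        # Figure 14 shows 00, 43 ('C') and 41 ('A') in the upper three bytes.
        "signature_ok": (msb, upper_mid, lower_mid) == (0x00, 0x43, 0x41),
        "device": chr(upper_mid) + chr(lower_mid),
        # None when the code is not listed in Table 12.
        "stepping": DIE_STEPPING.get(lsb),
    }


if __name__ == "__main__":
    print(decode_stepping_register(0x00434102))
```

Because the register is overwritten by normal program use, a real initialization routine would need to save the value before anything else touches it.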
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4.0 ELECTRICAL SPECIFICATIONS 


4.1 Absolute Maximum Ratings 


Parameter                           Maximum Rating 

Storage Temperature                 -65°C to +150°C 
Case Temperature Under Bias         -65°C to +110°C 
Supply Voltage wrt. Vss             -0.5V to +6.5V 
Voltage on Other Pins wrt. Vss      -0.5V to Vcc + 0.5V 


NOTICE: This is a production data sheet. The specifications are subject to change 
without notice. 

*WARNING: Stressing the device beyond the “Absolute Maximum Ratings” may cause 
permanent damage. These are stress ratings only. Operation beyond the “Operating 
Conditions” is not recommended and extended exposure beyond the “Operating 
Conditions” may affect device reliability. 


4.2. Operating Conditions 


Operating Conditions (80960CA-33, -25, -16) 


Symbol    Parameter                              Min     Max     Units   Notes 

Vcc       Supply Voltage 
            80960CA-33                           4.75    5.25    V 
            80960CA-25                           4.50    5.50    V 
            80960CA-16                           4.50    5.50    V 

fCLK2x    Input Clock Frequency (2-x Mode) 
            80960CA-33                           0       66      MHz 
            80960CA-25                           0       50      MHz 
            80960CA-16                           0       32      MHz 

fCLK1x    Input Clock Frequency (1-x Mode)                               (1) 
            80960CA-33                           8       33      MHz 
            80960CA-25                           8       25      MHz 
            80960CA-16                           8       16      MHz 

Tc        Case Temperature Under Bias 
          (80960CA-33, -25, -16) 
            PGA Package                          0       100     °C 
            196-Pin PQFP                         0       100     °C 


NOTE: 

(1) When in the 1-x input clock mode, CLKIN is an input to an internal phase-locked loop and must maintain a minimum 
frequency of 8 MHz for proper processor operation. However, in the 1-x Mode, CLKIN may still be stopped when the 
processor either is in a reset condition or is reset. If CLKIN is stopped, the specified RESET low time must be provided once 
CLKIN restarts and has stabilized. 
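
The frequency limits in the Operating Conditions table lend themselves to a simple range check. The sketch below encodes the table as a dictionary and tests a proposed CLKIN frequency against it; the data structure and function name are assumptions for illustration, and the check ignores the reset-time exception described in note (1).

```python
# CLKIN frequency limits from the Operating Conditions table, in MHz.
# Keys are the speed grade (33, 25, 16) and the clock mode ("1x" or "2x").
# The structure and names are illustrative.

CLKIN_LIMITS_MHZ = {
    33: {"2x": (0.0, 66.0), "1x": (8.0, 33.0)},
    25: {"2x": (0.0, 50.0), "1x": (8.0, 25.0)},
    16: {"2x": (0.0, 32.0), "1x": (8.0, 16.0)},
}


def clkin_in_range(grade: int, mode: str, clkin_mhz: float) -> bool:
    """True when clkin_mhz lies within the table limits for grade and mode."""
    low, high = CLKIN_LIMITS_MHZ[grade][mode]
    return low <= clkin_mhz <= high


if __name__ == "__main__":
    print(clkin_in_range(33, "2x", 66.0))
    print(clkin_in_range(25, "1x", 4.0))
```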


4.3 Recommended Connections 

Power and ground connections must be made to multiple Vcc and Vss (GND) pins. 
Every 80960CA-based circuit board should include power (Vcc) and ground (Vss) 
planes for power distribution. Every Vcc pin must be connected to the power plane, 
and every Vss pin must be connected to the ground plane. Pins identified as “N.C.” 
must not be connected in the system. 

Liberal decoupling capacitance should be placed near the 80960CA. The processor 
can cause transient power surges when its numerous output buffers transition, 
particularly when connected to large capacitive loads. 

Low inductance capacitors and interconnects are recommended for best high 
frequency electrical performance. Inductance can be reduced by shortening the 
board traces between the processor and decoupling capacitors as much as possible. 
Capacitors specifically designed for PGA packages will offer the lowest possible 
inductance. 

For reliable operation, always connect unused inputs to an appropriate signal level. 
In particular, any unused interrupt (XINT, NMI) or DMA (DREQ) input should be 
connected to Vcc through a pull-up resistor, as should BTERM if not used. Pull-up 
resistors should be in the range of 20 kΩ for each pin tied high. If READY or HOLD 
are not used, the unused input should be connected to ground. N.C. pins must 
always remain unconnected. Refer to the 80960CA User's Manual for more 
information. 
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4.4. DC Specifications 

DC Characteristics 


(80960CA-33, -25, -16 under the conditions described in Section 4.2, Operating Conditions.) 


Symbol   Parameter                                        Min         Max         Units   Notes 

VIL      Input Low Voltage for all pins except RESET      -0.3        0.8         V 
VIH      Input High Voltage for all pins except RESET     2.0         Vcc + 0.3   V 
VOL      Output Low Voltage                                           0.45        V       IOL = 5 mA 
VOH      Output High Voltage     IOH = -1 mA              2.4                     V 
                                 IOH = -200 µA            Vcc - 0.5               V 
VILR     Input Low Voltage for RESET                      -0.3        1.5         V 
VIHR     Input High Voltage for RESET                     3.5         Vcc + 0.3   V 
ILI1     Input Leakage Current for each pin except                    ±15         µA      0V ≤ VIN ≤ Vcc (1) 
         BTERM, ONCE, DREQ3:0, STEST, 
         EOP3:0/TC3:0, NMI, XINT7:0, 
         READY, HOLD, BOFF, CLKMODE 
ILI2     Input Leakage Current for:                       0           -300        µA      VIN = 0.45V (2) 
         BTERM, ONCE, DREQ3:0, STEST, 
         EOP3:0/TC3:0, NMI, XINT7:0, BOFF 
ILI3     Input Leakage Current for:                       0           500         µA      VIN = 2.4V (3) 
         READY, HOLD, CLKMODE 
ILO      Output Leakage Current                                       ±15         µA      0.45V ≤ VOUT ≤ Vcc 
Icc      Supply Current (80960CA-33) 
           Icc Max                                                    900         mA      (4) 
           Icc Typ                                                    750                 (5) 
Icc      Supply Current (80960CA-25) 
           Icc Max                                                    750         mA      (4) 
           Icc Typ                                                    600                 (5) 
Icc      Supply Current (80960CA-16) 
           Icc Max                                                    550         mA      (4) 
           Icc Typ                                                    400                 (5) 
IONCE    ONCE-mode Supply Current                                     100         mA 
CIN      Input Capacitance for:                           0           12          pF      FC = 1 MHz 
         CLKIN, RESET, ONCE, 
         READY, HOLD, DREQ3:0, BOFF, 
         XINT7:0, NMI, BTERM, CLKMODE 
COUT     Output Capacitance of each output pin                        12          pF      FC = 1 MHz, (6) 
CI/O     I/O Pin Capacitance                                          12          pF      FC = 1 MHz 


NOTES: 

(1) No pull-up or pull-down. 

(2) These pins have internal pull-up resistors. 

(3) These pins have internal pull-down resistors. 

(4) Measured at worst case frequency, Vcc and temperature, with device operating and outputs loaded to the test conditions 
described in Section 4.5.1, AC Test Conditions. 

(5) Icc Typical is not tested. 

(6) Output Capacitance is the capacitive load of a floating output. 

(7) CLKMODE pin has a pull-down resistor only when ONCE pin is deasserted. 
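
The input thresholds in the DC Characteristics table can be read as a small classification rule: a voltage is a valid low at or below VIL (or VILR for RESET), a valid high at or above VIH (or VIHR), and undefined in between. The sketch below encodes that reading; the function name and the returned strings are assumptions for illustration.

```python
# Classify an input voltage against the VIL/VIH (or VILR/VIHR) limits from
# the DC Characteristics table. Names and return strings are illustrative.


def classify_input(v_in: float, vcc: float = 5.0, is_reset: bool = False) -> str:
    """Return 'low', 'high', 'undefined' or 'out_of_range' for v_in (volts)."""
    # RESET uses the wider VILR/VIHR thresholds; all other pins use VIL/VIH.
    v_low_max, v_high_min = (1.5, 3.5) if is_reset else (0.8, 2.0)
    if v_in < -0.3 or v_in > vcc + 0.3:
        return "out_of_range"
    if v_in <= v_low_max:
        return "low"
    if v_in >= v_high_min:
        return "high"
    return "undefined"


if __name__ == "__main__":
    print(classify_input(1.2))
    print(classify_input(1.2, is_reset=True))
```

The same 1.2 V level is undefined on an ordinary input but a valid low on RESET, which is why the two pin classes are specified separately.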
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4.5 AC Specification 


AC Characteristics — 80960CA-33 

(80960CA-33 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) 


Symbol          Parameter                                  Min        Max        Units   Notes 

INPUT CLOCK (10) 

TF              CLKIN Frequency                            0          66         MHz     (1) 
TC              CLKIN Period 
                  In One-X Mode (fCLK1x)                   30.3       125        ns      (1,12) 
                  In Two-X Mode (fCLK2x)                   15.15      ∞          ns      (1) 
TCS             CLKIN Period Stability 
                  In One-X Mode (fCLK1x)                              ±0.1%      Δ       (1,13) 
TCH             CLKIN High Time 
                  In One-X Mode (fCLK1x)                   6          62.5       ns      (1,12) 
                  In Two-X Mode (fCLK2x)                   6          ∞          ns      (1) 
TCL             CLKIN Low Time 
                  In One-X Mode (fCLK1x)                   6          62.5       ns      (1,12) 
                  In Two-X Mode (fCLK2x)                   6          ∞          ns      (1) 
TCR             CLKIN Rise Time                            0          6          ns      (1) 
TCF             CLKIN Fall Time                            0          6          ns      (1) 

OUTPUT CLOCKS (9) 

TCP             CLKIN to PCLK2:1 Delay 
                  In One-X Mode (fCLK1x)                   -2         2          ns      (1,3,13,14) 
                  In Two-X Mode (fCLK2x)                   2          25         ns      (1,3) 
T               PCLK2:1 Period 
                  In One-X Mode (fCLK1x)                         TC              ns      (1,13) 
                  In Two-X Mode (fCLK2x)                         2TC             ns      (1,3) 
TPH             PCLK2:1 High Time                          (T/2) - 2  T/2        ns      (1,13) 
TPL             PCLK2:1 Low Time                           (T/2) - 2  T/2        ns      (1,13) 
TPR             PCLK2:1 Rise Time                          1          4          ns      (1,3) 
TPF             PCLK2:1 Fall Time                          1          4          ns      (1,3) 

SYNCHRONOUS OUTPUTS (10) 

TOV, TOH        Output Valid Delay, Output Hold                                          (6,11) 
TOV1, TOH1        A31:2                                    3          14         ns 
TOV2, TOH2        BE3:0                                    3          16         ns 
TOV3, TOH3        ADS                                      6          18         ns 
TOV4, TOH4        W/R                                      3          18         ns 
TOV5, TOH5        D/C, SUP, DMA                            4          16         ns 
TOV6, TOH6        BLAST, WAIT                              5          16         ns 
TOV7, TOH7        DEN                                      3          16         ns 
TOV8, TOH8        HOLDA, BREQ                              4          16         ns 
TOV9, TOH9        LOCK                                     4          16         ns 
TOV10, TOH10      DACK3:0, EOP3:0/TC3:0                    3          18         ns 
TOV11, TOH11      D31:0                                    3          16         ns 
TOV12, TOH12      DT/R                                     T/2 + 3    T/2 + 14   ns 
TOV13, TOH13      FAIL                                     2          14         ns      (6,11) 
TOF             Output Float for all outputs               3          22         ns      (6) 

SYNCHRONOUS INPUTS (10) 

TIS             Input Setup 
TIS1              D31:0                                    3                     ns      (1,11) 
TIS2              BOFF                                     17                    ns      (1,11) 
TIS3              BTERM/READY                              7                     ns      (1,11) 
TIS4              HOLD                                     7                     ns      (1,11) 
TIH             Input Hold 
TIH1              D31:0                                    5                     ns      (1,11) 
TIH2              BOFF                                     5                     ns      (1,11) 
TIH3              BTERM/READY                              2                     ns      (1,11) 
TIH4              HOLD                                     3                     ns      (1,11) 
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AC Characteristics — 80960CA-33 


80960CA-33 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 


Symbol 

Parameter 

Min 

Max 

Units 

Notes 

RELATIVE OUTPUT T1MINGS(9.7) 

Tavshi 

A31 :2 Valid to ADS Rising 

T - 4 

T + 4 

ns 


TaVSH2 

BE3:0. W/R, SUP, D/C, 

DMA, DACK3:0 Valid to ADS Rising 

T - 6 

T + 6 

ns 



A31:2 Valid to DEN Falling 

T - 4 

T + 4 

ns 



BE3:0, W/R, SUP. INST, 

DMA, DACK3:0 Valid to DEN Falling 

T - 6 

T + 6 

ns 


Tnlqv 

WAIT Falling to Output Data Valid 

±4 

ns 


Tdvnh 

Output Data Valid to WAIT Rising 

N*T - 4 

N*T + 4 

ns 

(4) 


WAIT Falling to WAIT Rising 

N*T ± 4 

ns 

mm 


Output Data Hold after WAIT Rising 

(N + 1)*T-4 

(N + 1) *T + 4 

ns 



DT/R Hold after DEN High 

T/2 - 4 

oo 

ns 



DT/R Valid to DEN Falling 

T/2 - 4 

T/2 + 4 

ns 


RELATIVE INPUT TIMINQS(7) 

TiS5 

RESET Input Setup 

6 


ns 

(15) 

T|H5 

RESET Input Hold 

5 


ns 

(15) 

T|S6 

DREQ3:0 Input Setup 

12 


ns 

(8) 

T|H6 

DREQ3:0 Input Hold 

7 


ns 

(8) 

T|S7 

XINT7:0, NMI Input Setup 

7 


ns 

(15) 

T|H7 

XINT7:0, NMI Input Hold 

3 


ns 

(15) 


NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of Nrad. Nrdd. Nwad. or Nw pp wait states that are programmed in the Bus Controller Region 
Table. When there are no wait states in an a ccess. W AIT never goes active. 

(5) N = Number of wait sta^s inserted with READY. 

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity. 

(7) See Notes 1 , 2 and 3. 

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous Inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor. 

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Section 4.5.3 to adjust the timing for 
PCLK2:1 loading. 

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. 
When the processor is in reset, the input clock may stop even in One-x mode. 

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less 
than ±0.1% between adjacent cycles. 

(14) This parameter is not tested. 

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular 
falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two 
consecutive PCLK2:1 falling edges to be seen by the processor. 
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AC Characteristics — 80960CA-25 


(80960CA-25 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) 


Symbol        Parameter                                Min           Max           Units  Notes

INPUT CLOCK(10)
TF            CLKIN Frequency                          0             50            MHz    (1)
TC            CLKIN Period
                In One-X Mode (fCLK1x)                 40            125           ns     (1,12)
                In Two-X Mode (fCLK2x)                 20            ∞             ns     (1)
TCS           CLKIN Period Stability
                In One-X Mode (fCLK1x)                               ±0.1%         Δ      (1,13)
TCH           CLKIN High Time
                In One-X Mode (fCLK1x)                 8             62.5          ns     (1,12)
                In Two-X Mode (fCLK2x)                 8             ∞             ns     (1)
TCL           CLKIN Low Time
                In One-X Mode (fCLK1x)                 8             62.5          ns     (1,12)
                In Two-X Mode (fCLK2x)                 8             ∞             ns     (1)
TCR           CLKIN Rise Time                          0             6             ns     (1)
TCF           CLKIN Fall Time                          0             6             ns     (1)

OUTPUT CLOCKS(9)
TCP           CLKIN to PCLK2:1 Delay
                In One-X Mode (fCLK1x)                 -2            2             ns     (1,3,13,14)
                In Two-X Mode (fCLK2x)                 2             25            ns     (1,3)
T             PCLK2:1 Period
                In One-X Mode (fCLK1x)                        TC                   ns     (1,13)
                In Two-X Mode (fCLK2x)                        2TC                  ns     (1,3)
TPH           PCLK2:1 High Time                        (T/2) - 3     T/2           ns     (1,13)
TPL           PCLK2:1 Low Time                         (T/2) - 3     T/2           ns     (1,13)
TPR           PCLK2:1 Rise Time                        1             4             ns     (1,3)
TPF           PCLK2:1 Fall Time                        1             4             ns     (1,3)

SYNCHRONOUS OUTPUTS(10)
TOV, TOH      Output Valid Delay, Output Hold                                             (6,11)
TOV1, TOH1      A31:2                                  3             16            ns
TOV2, TOH2      BE3:0                                  3             18            ns
TOV3, TOH3      ADS                                    6             20            ns
TOV4, TOH4      W/R                                    3             20            ns
TOV5, TOH5      D/C, SUP, DMA                          4             18            ns
TOV6, TOH6      BLAST, WAIT                            5             18            ns
TOV7, TOH7      DEN                                    3             18            ns
TOV8, TOH8      HOLDA, BREQ                            4             18            ns
TOV9, TOH9      LOCK                                   4             18            ns
TOV10, TOH10    DACK3:0, EOP3:0/TC3:0                  4             20            ns
TOV11, TOH11    D31:0                                  3             18            ns
TOV12, TOH12    DT/R                                   T/2 + 3       T/2 + 16      ns
TOV13, TOH13    FAIL                                   2             16            ns     (6,11)
TOF           Output Float for all outputs             3             22            ns     (6)

SYNCHRONOUS INPUTS(10)
TIS           Input Setup
TIS1            D31:0                                  5                           ns     (1,11)
TIS2            BOFF                                   19                          ns     (1,11)
TIS3            BTERM/READY                            9                           ns     (1,11)
TIS4            HOLD                                   9                           ns     (1,11)
TIH           Input Hold
TIH1            D31:0                                  5                           ns     (1,11)
TIH2            BOFF                                   7                           ns     (1,11)
TIH3            BTERM/READY                            2                           ns     (1,11)
TIH4            HOLD                                   5                           ns     (1,11)
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AC Characteristics •— 80960CA-25 

(80960CA-25 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 


Symbol   Parameter                              Min              Max              Units  Notes

RELATIVE OUTPUT TIMINGS(9,7)
TAVSH1   A31:2 Valid to ADS Rising              T - 4            T + 4            ns
TAVSH2   BE3:0, W/R, SUP, D/C,
         DMA, DACK3:0 Valid to ADS Rising       T - 6            T + 6            ns
TAVEL1   A31:2 Valid to DEN Falling             T - 4            T + 4            ns
TAVEL2   BE3:0, W/R, SUP, INST,
         DMA, DACK3:0 Valid to DEN Falling      T - 6            T + 6            ns
TNLQV    WAIT Falling to Output Data Valid               ±4                       ns
TDVNH    Output Data Valid to WAIT Rising       N*T - 4          N*T + 4          ns     (4)
TNLNH    WAIT Falling to WAIT Rising                     N*T ± 4                  ns     (4)
TNHQX    Output Data Hold after WAIT Rising     (N + 1)*T - 4    (N + 1)*T + 4    ns
TEHTV    DT/R Hold after DEN High               T/2 - 4          ∞                ns     (6)
TDVEL    DT/R Valid to DEN Falling              T/2 - 4          T/2 + 4          ns     (7)

RELATIVE INPUT TIMINGS(7)
TIS5     RESET Input Setup                      8                                 ns     (15)
TIH5     RESET Input Hold                       7                                 ns     (15)
TIS6     DREQ3:0 Input Setup                    14                                ns     (8)
TIH6     DREQ3:0 Input Hold                     9                                 ns     (8)
TIS7     XINT7:0, NMI Input Setup               9                                 ns     (15)
TIH7     XINT7:0, NMI Input Hold                5                                 ns     (15)


NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of NRAD, NRDD, NWAD or NWDD wait states that are programmed in the Bus Controller Region 
Table. When there are no wait states in an access, WAIT never goes active. 

(5) N = Number of wait states inserted with READY. 

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity. 

(7) See Notes 1 , 2 and 3. 

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor. 

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Section 4.5.3 to adjust the timing for 
PCLK2:1 loading. 

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. 
When the processor is in reset, the input clock may stop even in One-x mode. 

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less 
than ±0.1% between adjacent cycles. 

(14) This parameter is not tested. 

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular 
falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two 
consecutive PCLK2:1 falling edges to be seen by the processor. 
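
As a worked example of the WAIT-related relative output timings in the table above, the sketch below evaluates the N*T ± 4 ns windows for a chosen PCLK2:1 period. The function name, its parameters and the example value T = 40 ns (the minimum One-X CLKIN period listed for the 80960CA-25) are illustrative assumptions; only the formulas come from the table and from notes (4) and (5).

```python
def wait_timing_windows(n_programmed: int, n_ready: int, t_ns: float) -> dict:
    """Return (min, max) windows in ns for the WAIT-related relative timings.

    n_programmed: wait states programmed in the region table (note 4).
    n_ready:      wait states inserted with READY (note 5).
    t_ns:         PCLK2:1 period T in nanoseconds.
    """
    if n_programmed < 1:
        # Note (4): with no wait states in an access, WAIT never goes active,
        # so the WAIT-relative windows are not defined.
        raise ValueError("WAIT never goes active without wait states")
    if n_ready < 0 or t_ns <= 0:
        raise ValueError("n_ready must be >= 0 and t_ns must be positive")

    return {
        # Output Data Valid to WAIT Rising: N*T - 4 .. N*T + 4
        "TDVNH": (n_programmed * t_ns - 4, n_programmed * t_ns + 4),
        # WAIT Falling to WAIT Rising: N*T +/- 4
        "TNLNH": (n_programmed * t_ns - 4, n_programmed * t_ns + 4),
        # Output Data Hold after WAIT Rising: (N + 1)*T - 4 .. (N + 1)*T + 4
        "TNHQX": ((n_ready + 1) * t_ns - 4, (n_ready + 1) * t_ns + 4),
    }


if __name__ == "__main__":
    for name, (lo, hi) in wait_timing_windows(2, 0, 40.0).items():
        print(f"{name}: {lo} ns to {hi} ns")
```

With two programmed wait states and T = 40 ns, WAIT stays low for 76 ns to 84 ns, which is the budget a slow memory device would have to fit inside.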
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AC Characteristics — 80960CA-16 


(80960CA-16 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 


Symbol        Parameter                                Min           Max           Units  Notes

INPUT CLOCK(10)
TF            CLKIN Frequency                          0             32            MHz    (1)
TC            CLKIN Period
                In One-X Mode (fCLK1x)                 62.5          125           ns     (1,12)
                In Two-X Mode (fCLK2x)                 31.25         ∞             ns     (1)
TCS           CLKIN Period Stability
                In One-X Mode (fCLK1x)                               ±0.1%         Δ      (1,13)
TCH           CLKIN High Time
                In One-X Mode (fCLK1x)                 10            62.5          ns     (1,12)
                In Two-X Mode (fCLK2x)                 10            ∞             ns     (1)
TCL           CLKIN Low Time
                In One-X Mode (fCLK1x)                 10            62.5          ns     (1,12)
                In Two-X Mode (fCLK2x)                 10            ∞             ns     (1)
TCR           CLKIN Rise Time                          0             6             ns     (1)
TCF           CLKIN Fall Time                          0             6             ns     (1)

OUTPUT CLOCKS(9)
TCP           CLKIN to PCLK2:1 Delay
                In One-X Mode (fCLK1x)                 -2            2             ns     (1,3,13,14)
                In Two-X Mode (fCLK2x)                 2             25            ns     (1,3)
T             PCLK2:1 Period
                In One-X Mode (fCLK1x)                        TC                   ns     (1,13)
                In Two-X Mode (fCLK2x)                        2TC                  ns     (1,3)
TPH           PCLK2:1 High Time                        (T/2) - 4     T/2           ns     (1,13)
TPL           PCLK2:1 Low Time                         (T/2) - 4     T/2           ns     (1,13)
TPR           PCLK2:1 Rise Time                        1             4             ns     (1,3)
TPF           PCLK2:1 Fall Time                        1             4             ns     (1,3)

SYNCHRONOUS OUTPUTS(10)
TOV, TOH      Output Valid Delay, Output Hold                                             (6,11)
TOV1, TOH1      A31:2                                  3             18            ns
TOV2, TOH2      BE3:0                                  3             20            ns
TOV3, TOH3      ADS                                    6             22            ns
TOV4, TOH4      W/R                                    3             22            ns
TOV5, TOH5      D/C, SUP, DMA                          4             20            ns
TOV6, TOH6      BLAST, WAIT                            5             20            ns
TOV7, TOH7      DEN                                    3             20            ns
TOV8, TOH8      HOLDA, BREQ                            4             20            ns
TOV9, TOH9      LOCK                                   4             20            ns
TOV10, TOH10    DACK3:0, EOP3:0/TC3:0                  4             22            ns
TOV11, TOH11    D31:0                                  3             20            ns
TOV12, TOH12    DT/R                                   T/2 + 3       T/2 + 18      ns
TOV13, TOH13    FAIL                                   2             18            ns     (6,11)
TOF           Output Float for all outputs             3             22            ns     (6)

SYNCHRONOUS INPUTS(10)
TIS           Input Setup
TIS1            D31:0                                  5                           ns     (1,11)
TIS2            BOFF                                   21                          ns     (1,11)
TIS3            BTERM/READY                            9                           ns     (1,11)
TIS4            HOLD                                   9                           ns     (1,11)
TIH           Input Hold
TIH1            D31:0                                  5                           ns     (1,11)
TIH2            BOFF                                   7                           ns     (1,11)
TIH3            BTERM/READY                            2                           ns     (1,11)
TIH4            HOLD                                   5                           ns     (1,11)
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AC Characteristics — 80960CA-16 

(80960CA-16 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 


Symbol   Parameter                              Min              Max              Units  Notes

RELATIVE OUTPUT TIMINGS(9,7)
TAVSH1   A31:2 Valid to ADS Rising              T - 4            T + 4            ns
TAVSH2   BE3:0, W/R, SUP, D/C,
         DMA, DACK3:0 Valid to ADS Rising       T - 6            T + 6            ns
TAVEL1   A31:2 Valid to DEN Falling             T - 6            T + 6            ns
TAVEL2   BE3:0, W/R, SUP, INST,
         DMA, DACK3:0 Valid to DEN Falling      T - 6            T + 6            ns
TNLQV    WAIT Falling to Output Data Valid               ±4                       ns
TDVNH    Output Data Valid to WAIT Rising       N*T - 4          N*T + 4          ns     (4)
TNLNH    WAIT Falling to WAIT Rising                     N*T ± 4                  ns     (4)
TNHQX    Output Data Hold after WAIT Rising     (N + 1)*T - 4    (N + 1)*T + 4    ns     (5)
TEHTV    DT/R Hold after DEN High               T/2 - 4          ∞                ns     (6)
TDVEL    DT/R Valid to DEN Falling              T/2 - 4          T/2 + 4          ns     (7)

RELATIVE INPUT TIMINGS(7)
TIS5     RESET Input Setup                      10                                ns     (15)
TIH5     RESET Input Hold                       9                                 ns     (15)
TIS6     DREQ3:0 Input Setup                    16                                ns     (8)
TIH6     DREQ3:0 Input Hold                     11                                ns     (8)
TIS7     XINT7:0, NMI Input Setup               9                                 ns     (15)
TIH7     XINT7:0, NMI Input Hold                5                                 ns     (15)


NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of NRAD, NRDD, NWAD or NWDD wait states that are programmed in the Bus Controller Region 
Table. When there are no wait states in an access, WAIT never goes active. 

(5) N = Number of wait states inserted with READY. 

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity. 

(7) See Notes 1 , 2 and 3. 

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor. 

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Figure 22 to adjust the timing for 
PCLK2:1 loading. 

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. 
When the processor is in reset, the input clock may stop even in One-x mode. 

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less 
than ±0.1% between adjacent cycles. 

(14) This parameter is not tested. 

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular 
falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two 
consecutive PCLK2:1 falling edges to be seen by the processor. 
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The AC Specifications in Section 4.5 are tested with 
the 50 pF load shown in Figure 15. See Figure 22 to 
see how timings vary with load capacitance. 

Specifications are measured at the 1.5V crossing 
point, unless otherwise indicated. Input waveforms 
are assumed to have a rise-and-fall time of ≤ 2 ns 
from 0.8V to 2.0V. See Section 4.5.2, AC Timing 
Waveforms, for AC spec definitions, test points, 
and illustrations. 


4.5.2. AC TIMING WAVEFORMS 
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— OUTPUT DELAY — The maximum output delay is referred to 
as the Output Valid Delay (TOV). The minimum output delay is 
referred to as the Output Hold (TOH). 


— OUTPUT FLOAT DELAY — The output float condition occurs 
when the maximum output current becomes less than ILO in magnitude. 


— INPUT SETUP AND HOLD — The input setup and hold requirements 
specify the sampling window during which synchronous inputs must be 
stable for correct processor operation. 


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Figure 22. Output Delay or Hold vs Load Capacitance 
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Figure 23. Rise and Fail Time Derating at Highest Operating Temperature and Minimum Vcc 
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5.0 RESET, BACKOFF AND HOLD 
ACKNOWLEDGE 

The following table lists the condition of each processor 
output pin while RESET is asserted (low). 


Table 13. Reset Conditions 


Pins        State During Reset (HOLDA inactive)(1)

A31:A2      Floating
D31:D0      Floating
BE3:0       Driven high (Inactive)
W/R         Driven low (Read)
ADS         Driven high (Inactive)
WAIT        Driven high (Inactive)
BLAST       Driven low (Active)
DT/R        Driven low (Receive)
DEN         Driven high (Inactive)
LOCK        Driven high (Inactive)
BREQ        Driven low (Inactive)
D/C         Floating
DMA         Floating
SUP         Floating
FAIL        Driven low (Active)
DACK3       Driven high (Inactive)
DACK2       Driven high (Inactive)
DACK1       Driven high (Inactive)
DACK0       Driven high (Inactive)
EOP/TC3     Floating (set to input mode)
EOP/TC2     Floating (set to input mode)
EOP/TC1     Floating (set to input mode)
EOP/TC0     Floating (set to input mode)


The following table lists the condition of each proc- 
essor output pin while HOLDA is asserted (low). 


Table 14. Hold Acknowledge 
and Backoff Conditions 


Pins        State During HOLDA

A31:A2      Floating
D31:D0      Floating
BE3:0       Floating
W/R         Floating
ADS         Floating
WAIT        Floating
BLAST       Floating
DT/R        Floating
DEN         Floating
LOCK        Floating
BREQ        Driven (high or low)
D/C         Floating
DMA         Floating
SUP         Floating
FAIL        Driven high (Inactive)
DACK3       Driven high (Inactive)
DACK2       Driven high (Inactive)
DACK1       Driven high (Inactive)
DACK0       Driven high (Inactive)
EOP/TC3     Driven if output
EOP/TC2     Driven if output
EOP/TC1     Driven if output
EOP/TC0     Driven if output


NOTE: 

(1) With regard to bus output pin state only, the Hold 
Acknowledge state takes precedence over the reset state. 
Although asserting the RESET pin will internally reset the 
processor, the processor’s bus output pins will not enter 
the reset state if it has granted Hold Acknowledge to a 
previous HOLD request (HOLDA is active). Furthermore, the 
processor will grant new HOLD requests and enter the 
Hold Acknowledge state even while in reset. 

For example, if HOLDA is not active and the processor is 
in the reset state, then HOLD is asserted, the processor’s 
bus pins will enter the Hold Acknowledge state and 
HOLDA will be granted. The processor will not be able to 
perform memory accesses until the HOLD request is 
removed, even if the RESET pin is brought high. This 
operation is provided to simplify boot-up synchronization among 
multiple processors sharing the same bus. 
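
The precedence rule in the note can be written down as a small decision function. This is only a sketch of the rule as stated in the note; the function name and the three state labels are hypothetical, chosen for illustration.

```python
def bus_output_state(reset_asserted: bool, holda_active: bool) -> str:
    """Pick which pin-state table applies to the bus output pins.

    Per the note, Hold Acknowledge takes precedence over reset for the
    bus output pins, even though RESET still resets the processor internally.
    """
    if holda_active:
        return "hold-acknowledge"   # Table 14 applies
    if reset_asserted:
        return "reset"              # Table 13 applies
    return "normal"                 # pins driven by ordinary bus activity


if __name__ == "__main__":
    for reset in (False, True):
        for holda in (False, True):
            print(reset, holda, bus_output_state(reset, holda))
```

The case of interest for multiprocessor boot-up is RESET asserted with HOLDA active: the pins follow the Hold Acknowledge table, so another bus master can keep using the bus while this processor is held in reset.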
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Figure 25. Cold Reset Waveform 
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6.0 BUS WAVEFORMS 








Figure 26. Warm Reset Waveform 
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Figure 27. Entering the ONCET^ State 

Signals shown: CLKIN, VCC, PCLK2:1, the processor bus and control pins, and ONCE. 

NOTE: CLKIN may not float. It must be driven high or low or continue to run. 
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NOTE: 

Case 1 and Case 2 show two possible polarities of PCLK2:1. 


Figure 28. Clock Synchronization in the 2x Clock Mode 
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Figure 29. Non-Burst, Non-Pipelined Accesses without wait states 
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Figure 31. Non-Burst, Non-Pipelined Write with wait states 
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Region Table Entry 


"V 

Field                    Bits         Value

Reserved                 bits 31-23   0...0
Byte Order               bit 22       X
Reserved                 bit 21       0
Bus Width                bits 20-19   10 (32-bit)
NWDD                     bits 18-17   XX
NWAD                     bits 16-12   XXXXX
NXDA                     bits 11-10   00
NRDD                     bits 9-8     00
NRAD                     bits 7-3     00000
Pipelining               bit 2        Off
External Ready Control   bit 1        0 (Disabled)
Burst                    bit 0        1 (Enabled)
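
The Region Table Entry layout above lends itself to a small pack/unpack helper. The sketch below is illustrative only: the field positions are the bit ranges listed in the figure, while the Python names (`FIELDS`, `pack_region_entry`, `unpack_region_entry`) and the lower-case field keys are assumptions made for this example. The only encoding taken from the figure is bus width `10` for a 32-bit bus.

```python
# Field name -> (least significant bit, width in bits), from the figure labels.
FIELDS = {
    "burst":      (0, 1),   # bit 0
    "ready":      (1, 1),   # bit 1, External Ready Control
    "pipelining": (2, 1),   # bit 2
    "nrad":       (3, 5),   # bits 7-3
    "nrdd":       (8, 2),   # bits 9-8
    "nxda":       (10, 2),  # bits 11-10
    "nwad":       (12, 5),  # bits 16-12
    "nwdd":       (17, 2),  # bits 18-17
    "bus_width":  (19, 2),  # bits 20-19
    "byte_order": (22, 1),  # bit 22
}


def pack_region_entry(**values: int) -> int:
    """Build a 32-bit entry; unspecified and reserved fields stay 0."""
    word = 0
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


def unpack_region_entry(word: int) -> dict:
    """Split a 32-bit entry back into its named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


if __name__ == "__main__":
    # Burst enabled, 32-bit bus, no wait states, ready disabled.
    entry = pack_region_entry(burst=1, bus_width=0b10)
    print(hex(entry))
    print(unpack_region_entry(entry))
```

Keeping the layout in one table means the same description drives both packing and unpacking, so the two cannot drift apart.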
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Figure 32. Burst, Non-Pipelined Read without wait states, 32-bit bus 
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Figure 34. Burst, Non-Pipelined Write without wait states, 32-bit bus 
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Figure 36. Burst, Non-Pipelined Read with wait states, 16-bit bus 
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Figure 37. Burst, Non-Pipelined Read with wait states, 8-bit bus 
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Region Table Entry 


r 
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bit 22 

bit 21 

bits 20-19 

bits 18-17 

bits 16-12 

bits 11-10 

bits 9-8 

bits 7-3 

bit 2 
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bito 

0 

X 
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0 
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X 

Disabled 
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Figure 38. Non-Burst, Pipelined Read without wait states, 32-bit bus 
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Figure 43. Burst, Pipelined Read with wait states, 8-bit bus 
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Figure 44. Using External READY 
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Quad-Word Read 
NRAD = 0, NRDD = 0, NXDA = 0 
Ready Enabled 

READY adds memory access time to data transfers, whether or not the bus access is a burst access. BTERM interrupts 
a bus access, whether or not the bus access has more data transfers pending. Either the READY signal or the BTERM 
signal will terminate a bus access if the signal is asserted during the last (or only) data transfer of the bus access. 

Figure 45. Terminating a Burst with BTERM 
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• End of DMA 
bus request 
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NOTES: 

1. Case 1: DREQ must deassert before DACK deasserts. Applications are Fly-by and some packing and unpacking 
modes, adjacent load-stores or store-loads, loads followed by loads, and stores followed by stores. 

2. Case 2: DREQ must be deasserted by the second clock (rising edge) after DACK is driven high. Applications are 
non-fly-by transfers and adjacent load-stores or store-loads. 

3. DACKx is asserted for the duration of a DMA bus request. The request may consist of multiple bus accesses (defined 
by ADS and BLAST). Refer to User’s Manual for “access”, “request” definition. 


Figure 48. DREQ and DACK Functional Timing 
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Figure 51. FAIL Functional Timing 
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BYTE OFFSET 0 


WORD OFFSET 0 


SHORT-WORD LOAD/STORE 
  SHORT REQUEST (ALIGNED) 
  BYTE, BYTE REQUESTS 
  SHORT REQUEST (ALIGNED) 
  BYTE, BYTE REQUESTS 

WORD LOAD/STORE 
  WORD REQUEST (ALIGNED) 
  SHORT, BYTE, BYTE REQUESTS 
  SHORT, SHORT REQUESTS 
  BYTE, SHORT, BYTE REQUESTS 

DOUBLE-WORD LOAD/STORE 
  ONE DOUBLE-WORD REQUEST (ALIGNED) 
  BYTE, SHORT, WORD, BYTE REQUESTS 
  SHORT, WORD, SHORT REQUESTS 
  BYTE, WORD, SHORT, BYTE REQUESTS 
  WORD, WORD REQUESTS 
  ONE DOUBLE-WORD REQUEST (ALIGNED) 

Figure 52. A Summary of Aligned and Unaligned Transfers for Little Endian Regions 
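
One way to read the double-word sequences in Figure 52 is as a greedy split: at each address, issue the largest naturally aligned request that still fits in the bytes remaining. The sketch below implements that rule as an assumption, not as the documented behavior of the bus controller. The function name and the mapping of sequences to byte offsets 1 through 4 are hypothetical, since the offsets did not survive in the figure, and the word-access labels may follow a different order.

```python
SIZE_NAMES = {1: "byte", 2: "short", 4: "word", 8: "double-word"}


def split_request(offset: int, length: int) -> list:
    """Split a load/store into naturally aligned bus requests.

    offset: starting byte address; length: 2, 4 or 8 bytes.
    Returns a list of (request name, byte address) tuples.
    """
    if length not in (2, 4, 8):
        raise ValueError("this sketch handles short, word and double-word only")
    if offset < 0:
        raise ValueError("offset must be non-negative")

    parts = []
    addr, remaining = offset, length
    while remaining > 0:
        # Largest size that is aligned at addr and does not overshoot.
        # Size 1 always qualifies, so the loop always finds a match.
        for size in (8, 4, 2, 1):
            if addr % size == 0 and size <= remaining:
                break
        parts.append((SIZE_NAMES[size], addr))
        addr += size
        remaining -= size
    return parts


if __name__ == "__main__":
    for off in range(5):
        names = [name for name, _ in split_request(off, 8)]
        print(f"double-word at byte offset {off}: {', '.join(names)}")
```

Under this rule a double-word at offset 1 becomes byte, short, word, byte, and one at offset 4 becomes word, word, which are among the sequences listed in the figure.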
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BYTE OFFSET 0 4 8 12 16 20 24 


WORD OFFSET 0 1 2 3 4 5 6 


TRIPLE-WORD LOAD/STORE 


QUAD-WORD LOAD/STORE 
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Figure 53. A Summary of Aligned and Unaligned Transfers for Little Endian Regions (Continued) 
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Figure 54. Idle Bus Operation 
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i960TM MC PROCESSOR 
PRODUCT OVERVIEW 



This chapter provides an overview of the architecture 
of the i960 MC processor. 

The i960 MC processor is the military-grade member of 
a new family of processors from Intel. This processor 
family is based on a new 32-bit architecture called the 
i960 architecture. The i960 architecture has been de- 
signed specifically to meet the needs of embedded appli- 
cations such as avionics, aerospace, weapons systems, 
robotics and instrumentation, where high reliability is 
critical. It represents a renewed commitment from Intel 
to provide reliable, high-performance processors and 
controllers for the embedded processor marketplace. 

The i960 architecture can best be characterized as a 
high-performance computing engine. It features 
high-speed instruction execution and ease of programming. 
It is also easily extensible, allowing processors 
and controllers based on this architecture to be 
conveniently customized to meet the needs of specific 
processing and control applications. 

Some of the important attributes of the i960 architec- 
ture include: 

• full 32-bit registers 

• high-speed, pipelined instruction execution 

• a convenient program execution environment with 
32 general-purpose registers and a versatile set of 
special-function registers 

• a highly optimized procedure call mechanism that 
features on-chip caching of local variables and pa- 
rameters 

• extensive facilities for handling interrupts and faults 

• extensive tracing facilities to support efficient pro- 
gram debugging and monitoring 

• register scoreboarding and write buffering to permit 
efficient operation with lower performance memory 
subsystems 

The i960 MC processor implements the i960 architec- 
ture, plus it offers several extensions to the architecture. 
Some of these extensions, such as on-chip support for 
floating-point arithmetic, virtual memory management 
and multitasking, are designed to enhance overall sys- 
tem performance. Several other extensions are designed 
to enhance system reliability and robustness. These ex- 
tensions include facilities for hardware enforced protec- 
tion of software modules and for creating fault tolerant 
systems through the use of redundant processors. 


The following sections describe those features of the 
i960 architecture that are provided to streamline code 
execution and simplify programming. The extensions to 
this architecture provided in the i960 MC processor are 
described at the end of the chapter. 


HIGH PERFORMANCE PROGRAM 
EXECUTION 

Much of the design of the i960 architecture has been 
aimed at maximizing the processor’s computational 
and data processing speed through increased parallel- 
ism. The following paragraphs describe several of the 
mechanisms and techniques used to accomplish this 
goal, including: 

• an efficient load and store memory-access model 

• caching of code and procedural data 

• overlapped execution of instructions 

• many one or two clock-cycle instructions 


Load and Store Model 

One of the more important features of the i960 archi- 
tecture is that most of its operations are performed on 
operands in registers, rather than in memory. For ex- 
ample, all the arithmetic, logical, comparison, branch- 
ing and bit operations are performed with registers and 
literals. 

This feature provides two benefits. First, it increases 
program execution speed by minimizing the number of 
memory accesses required to execute a program. Sec- 
ond, it reduces memory latency encountered when us- 
ing slower, lower-cost memory parts. 

To support this concept, the architecture provides a 
generous supply of general-purpose registers. For each 
procedure, 32 registers are available (28 of which are 
available for general use). These registers are divided 
into two types: global and local. Both these types of 
registers can be used for general storage of operands. 
The only difference is that global registers retain their 
contents across procedure boundaries, whereas the 
processor allocates a new set of local registers each time 
a new procedure is called. 
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The architecture also provides a set of fast, versatile 
load and store instructions. These instructions allow 
burst transfers of 1, 2, 4, 8, 12 or 16 bytes of informa- 
tion between memory and the registers. 

On-Chip Caching of Code and Data 

To further reduce memory accesses, the architecture 
offers two mechanisms for caching code and data on 
chip: an instruction cache and multiple sets of local 
registers. The instruction cache allows prefetching of 
blocks of instruction from memory, which helps insure 
that the instruction execution pipeline is supplied with 
a steady stream of instructions. It also reduces the 
number of memory accesses required when performing 
iterative operations such as loops. (The size of the in- 
struction cache can vary. With the i960 MC processor, 
it is 512 bytes.) 

To optimize the architecture’s procedure call mecha- 
nism, the processor provides multiple sets of local regis- 
ters. This allows the processor to perform most proce- 
dure calls without having to write the local registers out 
to the stack in memory. 

(The number of local-register sets provided depends on 
the processor implementation. The i960 MC processor 
provides four sets of local registers.) 
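
The effect of keeping several local-register sets on chip can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the processor's actual allocation logic: the class name, the oldest-first spill policy and the reload-on-return policy are all hypothetical; only the figure of four sets comes from the text.

```python
class LocalRegisterCache:
    """Toy model of on-chip local-register sets with spill to the stack."""

    def __init__(self, num_sets: int = 4):
        self.num_sets = num_sets
        self.on_chip = []     # frames held on chip, oldest first
        self.in_memory = []   # frames spilled to the stack, oldest first
        self.spills = 0       # register sets written out to memory
        self.fills = 0        # register sets read back from memory

    def call(self, frame: str) -> None:
        """Allocate a register set for a newly called procedure."""
        if len(self.on_chip) == self.num_sets:
            # All sets in use: write the oldest one out to the stack.
            self.in_memory.append(self.on_chip.pop(0))
            self.spills += 1
        self.on_chip.append(frame)

    def ret(self) -> None:
        """Deallocate the current set and restore the caller's set."""
        self.on_chip.pop()
        if not self.on_chip and self.in_memory:
            # The caller's registers were spilled earlier: reload them.
            self.on_chip.append(self.in_memory.pop())
            self.fills += 1


if __name__ == "__main__":
    cache = LocalRegisterCache(num_sets=4)
    for frame in ["main", "a", "b", "c", "d"]:
        cache.call(frame)
    print("spills after 5 nested calls:", cache.spills)
```

In this model, call chains up to four deep never touch memory, which is the behavior the text describes as performing most procedure calls without writing the local registers to the stack.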

Overlapped Instruction Execution 

Another technique that the i960 architecture employs 
to enhance program execution speed is overlapping the 
execution of some instructions. This is accomplished 
through two mechanisms: register scoreboarding and 
branch prediction. 

Register scoreboarding permits instruction execution to 
continue while data is being fetched from memory. 
When a load instruction is executed, the processor sets 
one or more scoreboard bits to indicate the target 
registers to be loaded. After the target registers are loaded, 
the scoreboard bits are cleared. While the target 
registers are being loaded, the processor is allowed to 
execute other instructions that do not use these registers. 
The processor uses the scoreboard bits to ensure that 
target registers are not used until the loads are 
complete. (The checking of scoreboard bits is transparent to 
software.) The net result of using this technique is that 
code can often be optimized in such a way as to allow 
some instructions to be executed in parallel. 
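
The scoreboard mechanism described above can be sketched as a set of busy flags, one per register. The class and method names below are hypothetical and the register names are only examples; the sketch models the set, clear and check steps from the text, not the processor's pipeline.

```python
class Scoreboard:
    """Track which registers are waiting on an outstanding load."""

    def __init__(self):
        self.pending = set()  # registers whose scoreboard bit is set

    def issue_load(self, dest_regs) -> None:
        """A load was issued: set the bit for each target register."""
        self.pending.update(dest_regs)

    def load_complete(self, dest_regs) -> None:
        """Data arrived: clear the bit for each target register."""
        self.pending.difference_update(dest_regs)

    def can_issue(self, regs_used) -> bool:
        """An instruction may run only if none of its registers is pending."""
        return self.pending.isdisjoint(regs_used)


if __name__ == "__main__":
    sb = Scoreboard()
    sb.issue_load(["g4"])                  # load into g4 is in flight
    print(sb.can_issue(["g5", "g6"]))      # independent instruction
    print(sb.can_issue(["g4", "g5"]))      # depends on the load
    sb.load_complete(["g4"])
    print(sb.can_issue(["g4", "g5"]))
```

Code scheduled so that independent instructions follow a load keeps `can_issue` true while the memory access is outstanding, which is the optimization opportunity the paragraph refers to.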


Single-Clock Instructions 

It is the intent of the i960 architecture that a processor 
be able to execute commonly used instructions such as 
move, add, subtract, logical operations, compare and 
branch in a minimum number of clock cycles (prefer- 
ably one clock cycle). The architecture supports this 
concept in several ways. For example, the load and 
store model described earlier in this chapter (with its 
concentration on register-to-register operations) allows 
simple operations to be performed without the over- 
head of memory-to-memory operations. 

Also, all the instructions in the i960 architecture are 
32 bits or 64 bits long and aligned on 32-bit boundaries. 
This feature allows instructions to be decoded in one 
clock cycle. It also eliminates the need for an instruc- 
tion-alignment stage in the pipeline. 

The design of the i960 MC processor takes full advan- 
tage of these features of the architecture, resulting in 
more than 50 instructions that can be executed in a 
single clock-cycle. 


Efficient Interrupt Model 

The i960 architecture provides an efficient mechanism 
for servicing interrupts from external sources. To han- 
dle interrupts, the processor maintains an interrupt ta- 
ble of 248 interrupt vectors (240 of which are available 
for general use). When an interrupt is signaled, the 
processor uses a pointer from the interrupt table to per- 
form an implicit call to an interrupt handler procedure. 
In performing this call, the processor automatically 
saves the state of the processor prior to receiving the 
interrupt; performs the interrupt routine; and then re- 
stores the state of the processor. A separate interrupt 
stack is also provided to segregate interrupt handling 
from application programs. 

The interrupt handling facilities also feature a method 
of prioritizing interrupts. Using this technique, the 
processor is able to store interrupts that are lower in 
priority than the task the processor is currently work- 
ing on in a pending interrupt section of the interrupt 
table. At certain defined times, the processor checks the 
pending interrupts and services them. 
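
The pending-interrupt scheme can be modeled in a few lines. This is a minimal sketch, not the processor's algorithm: the class name is hypothetical, and two details are assumptions not stated in this chapter, namely that the 248 vectors are numbered 8 through 255 and that an interrupt's priority is its vector number divided by 8.

```python
class InterruptController:
    """Toy model: service higher-priority interrupts, post the rest."""

    def __init__(self, current_priority: int = 0):
        self.current_priority = current_priority
        self.pending = []    # vectors posted for later servicing
        self.serviced = []   # vectors serviced, in order

    @staticmethod
    def priority(vector: int) -> int:
        # Assumption: vectors 8..255, priority = vector // 8.
        if not 8 <= vector <= 255:
            raise ValueError("vector out of range")
        return vector // 8

    def signal(self, vector: int) -> None:
        """Service the interrupt now, or store it as pending."""
        if self.priority(vector) > self.current_priority:
            self.serviced.append(vector)
        else:
            self.pending.append(vector)

    def check_pending(self) -> None:
        """Service pending interrupts that now outrank the current task."""
        ready = [v for v in self.pending
                 if self.priority(v) > self.current_priority]
        for vector in sorted(ready, key=self.priority, reverse=True):
            self.pending.remove(vector)
            self.serviced.append(vector)


if __name__ == "__main__":
    ic = InterruptController(current_priority=10)
    ic.signal(200)   # priority 25: serviced immediately
    ic.signal(40)    # priority 5: stored as pending
    ic.current_priority = 0
    ic.check_pending()
    print(ic.serviced, ic.pending)
```

The point of the model is the split between immediate servicing and posting: a low-priority request is never lost, it simply waits until the processor's priority drops.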
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SIMPLIFIED PROGRAMMING 
ENVIRONMENT 

Partly as a side benefit of its streamlined execution en- 
vironment and partly by design, processors based on 
the i960 architecture are particularly easy to program. 
For example, the large number of general-purpose reg- 
isters allows relatively complex algorithms to be execut- 
ed with a minimum number of memory accesses. The 
following paragraphs describe some of the other fea- 
tures that simplify programming. 


Highly Efficient Procedure Call 
Mechanism 

The procedure call mechanism makes procedure calls 
and parameter passing between procedures simple and 
compact. Each time a call instruction is issued, the 
processor automatically saves the current set of local 
registers and allocates a new set of local registers for 
the called procedure. Likewise, on a return from a pro- 
cedure, the current set of local registers is deallocated 
and the local registers for the procedure being returned 
to are restored. On a procedure call, the program thus 
never has to explicitly save and restore those local vari- 
ables and parameters that are stored in local registers. 


Versatile Instruction Set and 
Addressing 

The selection of instructions and addressing modes also 
simplifies programming. The architecture offers a full 
set of load, store, move, arithmetic, comparison and 
branch instructions, with operations on both integer 
and ordinal data types. It also provides a complete set 
of Boolean and bit-field instructions, to simplify opera- 
tions on bits and bit strings. 

The addressing modes are efficient and straightforward, 
while at the same time providing the necessary indexing 
and scaling modes required to address complex arrays 
and record structures. 

The large 4-gigabyte address space provides ample 
room to store programs and data. The availability of 32 
addressing lines allows some address lines to be memo- 
ry-mapped to control hardware functions. 


Extensive Fault Handling Capability 

To aid in program development, the i960 architecture 
defines a wide selection of faults that the processor detects,
including arithmetic faults, invalid operands, invalid
operations and machine faults. When a fault is
detected, the processor makes an implicit call to a fault 
handler routine, using a mechanism similar to that de- 
scribed above for interrupts. The information collected 
for each fault allows program developers to quickly 
correct faulting code. It also allows automatic recovery 
from some faults. 


Debugging and Monitoring 


To support debugging systems, the i960 architecture 
provides a mechanism for monitoring processor activity 
by means of trace events. The processor can be config- 
ured to detect as many as seven different trace events, 
including branches, calls, supervisor calls, returns, prereturns,
breakpoints and the execution of any instruction. When the
processor detects a trace event, it signals a trace fault and
calls a fault handler. Intel provides several tools that use this
feature, including an in-circuit emulator (ICE™) device.


SUPPORT FOR ARCHITECTURAL 
EXTENSIONS 



The i960 architecture described earlier in this chapter 
provides a high-performance computing engine for use 
as the computational and data-processing core of embedded
processors or controllers. The architecture also
provides several features that enable processors based 
on this architecture to be easily customized to meet the 
needs of specific embedded applications, such as signal 
processing, array processing or graphics processing. 


The most important of these features is a set of 32 spe- 
cial-function registers. These registers provide a conve- 
nient interface to circuitry in the processor or to pins 
that can be connected to external hardware. They can 
be used to control timers, to perform operations on spe- 
cial data types or to perform I/O functions. 

The special-function registers are similar to the global 
registers. They can be addressed by all the register-ac- 
cess instructions. 


EXTENSIONS INCLUDED IN THE 
80960MC PROCESSOR 

The extensions to the i960 architecture included in the 
i960 MC processor are built on top of the processor’s 
core computing engine. These extensions are aimed at 
improving the efficiency and reliability of embedded 
systems. 
On-Chip Floating Point

The i960 MC processor provides a complete implemen- 
tation of the IEEE standard for binary floating-point 
arithmetic (IEEE 754-1985). This implementation includes
a full set of floating-point operations, including
add, subtract, multiply, divide, trigonometric functions 
and logarithmic functions. These operations are per- 
formed on single precision (32-bit), double precision 
(64-bit) and extended precision (80-bit) real numbers. 

One of the benefits of this implementation is that the 
floating-point handling facilities are completely inte- 
grated into the normal instruction execution environ- 
ment. Single- and double-precision floating-point values 
are stored in the same registers as non-floating point 
values. Also, four 80-bit floating-point registers are pro- 
vided to hold extended-precision values. 


String and Decimal Operations

The i960 MC processor provides several instructions 
for moving, filling and comparing byte strings in mem- 
ory. These instructions speed up string operations and 
reduce the amount of code required to handle strings. 

The decimal instructions perform move, add with carry 
and subtract with carry operations on binary-coded 
decimal (BCD) strings. 
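
As a rough illustration of what add-with-carry on BCD strings involves, the following Python sketch adds two decimal strings digit by digit while propagating a carry. The function name `bcd_add` and the list-of-digits representation are assumptions made for this sketch; they do not describe the processor's instruction encoding or its in-memory string format.

```python
def bcd_add(a_digits, b_digits, carry_in=0):
    """Add two equal-length BCD digit lists, least significant digit first.

    Returns (sum_digits, carry_out). Hypothetical helper for illustration only.
    """
    if len(a_digits) != len(b_digits):
        raise ValueError("operands must have the same number of digits")
    result = []
    carry = carry_in
    for a, b in zip(a_digits, b_digits):
        total = a + b + carry      # each digit is in the range 0..9
        result.append(total % 10)  # digit that stays in this position
        carry = total // 10        # carry passed to the next position
    return result, carry


# 275 + 948 = 1223, with digits stored least significant first
digits, carry = bcd_add([5, 7, 2], [8, 4, 9])
print(digits, carry)
```

The carry returned by one call can be fed into the next, which is how a long decimal string would be processed in pieces.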

Virtual-Memory Support 

Another of the i960 MC processor’s important features 
is support for virtual-memory management. When us- 
ing the processor in virtual-memory mode, the proces- 
sor provides each process (or task) with an address 
space of up to 2^32 bytes. This address space is paged
into physical memory in 4 Kbyte pages. On-chip mem- 
ory-management facilities handle virtual-to-physical 
address translation. A translation look-aside buffer 
(TLB) speeds address translation by storing virtual-to- 
physical address translations for frequently accessed 
parts of memory, such as the location of the page tables 
and the location of often used system data structures. 


Protection 

The i960 MC processor offers two mechanisms for pro- 
tecting critical data structures or software modules. 
The first is the ability to use page rights bits to restrict 
access to individual pages. Page rights allow various 
levels of access to be assigned to a page, ranging from 
no access to read only to read-write. 

The second protection mechanism is a user/supervisor 
protection model. This two-level protection model pro- 
vides hardware enforced protection of kernel proce- 
dures and data structures. When using this protection 
mechanism, privileged procedures and data are placed
in protected pages of memory. These pages can then be 
accessed only through a procedure table, which pro- 
vides a tightly controlled interface to kernel functions. 


Multitasking 

The i960 MC processor offers a variety of process man- 
agement facilities to support concurrent execution of 
multiple tasks. These facilities can be divided into two 
groups: process scheduling and interprocess
communications.

The process scheduling facilities consist of a set of gen- 
eral-purpose data structures and instructions, which are 
designed to support several different multitasking 
schemes. For example, the processor provides a set of 
instructions that allow the kernel to explicitly dispatch 
a task (bind it to the processor) and to suspend a task 
(save the current state of a task so that another task can 
be bound to the processor). These instructions can be 
used within kernel procedures to schedule, dispatch 
and preempt multiple tasks. 

The processor also provides a unique feature called self 
dispatching. Here, the kernel schedules tasks by queu- 
ing them to a dispatch port. Thereafter, the processor 
handles the dispatching, preempting and rescheduling 
of the tasks automatically, independent of the kernel. 
When using this mechanism, tasks can be scheduled by 
priority, with up to 32 priority levels to choose from. 
The processor’s interprocess communication facilities
include support for semaphores and communication 
ports. These facilities allow synchronization of interde- 
pendent tasks and asynchronous communication be- 
tween tasks. 


Multiprocessing 

The i960 MC processor provides several mechanisms 
designed to simplify the design of multiple-processor 
systems, allowing several processors to run in parallel, 
using shared memory resources. One of these mecha- 
nisms is the self-dispatching capability described above. 
Here, two or more processors can schedule and dis- 
patch processes from a single dispatch port, with each 
processor equally sharing the processing load. 

The processor also provides an inter-agent communication
(IAC) mechanism that allows processors to exchange
messages among themselves on the bus. This mechanism
operates similarly to the interrupt mechanism, except that
IAC messages are passed through dedicated sections of
memory. The IAC mechanism
can be used to preempt processes running on another 
processor, to manage interrupt handling or to initialize 
and synchronize several processors. 

A set of atomic instructions is also provided to synchronize
memory accesses. Multiple processors can
then access shared memory without inserting inaccura- 
cies and ambiguities into shared data structures. 


Fault Tolerance 

The i960 family of components supports fault-tolerant 
system design through the use of the M82965 Bus Ex- 
tension Unit component. The M82965 allows two proc- 
essors to be operated in tandem to form a self-checking 
module. The two M82965s check the outputs of two 
processors (a master and a checker) cycle-by-cycle. If 
the checking M82965 detects a difference between out- 
puts, it signals an error. A software recovery procedure 
can then be initiated. 

This fault detection mechanism supports several fault 
detection and recovery techniques, including self heal- 
ing, and continuous-operation (non-stop) systems. 

LOOK FOR MORE IN THE FUTURE 

The i960 architecture offers exceptional performance, 
plus a wealth of useful features to help in the design of 
efficient and reliable embedded systems. But equally 
important, it offers lots of room to grow. The i960 MC 
processor provides average instruction processing rates 
of 7.5 million instructions per second (7.5 MIPS) at a
20 MHz clock rate and 10 MIPS at a 25 MHz clock
rate(1).

However, the i960 MC processor is only the beginning. 
With improvements in VLSI technology, future imple- 
mentations of the i960 architecture will offer even 
greater performance. They will also offer a variety of 
useful extensions to solve specific control and monitor- 
ing needs in the field of embedded applications. 


1. 1 MIPS is equivalent to the performance of a Digital Equipment Corp. VAX-11/780.
80960MC

EMBEDDED 32-BIT MICROPROCESSOR 
WITH INTEGRATED FLOATING-POINT UNIT 
AND MEMORY MANAGEMENT UNIT 

Military 


■ High-Performance Embedded 

Architecture 

— 25 MIPS Burst Execution at 25 MHz 

— 9.4 MIPS* Sustained Execution at
25 MHz 

■ On-Chip Floating-Point Unit 

— Supports IEEE 754 Floating-Point 
Standard 

— Full Transcendental Support 

— Four 80-Bit Registers 

— 5.2 Million Whetstones/Second at 
25 MHz 

■ 512-Byte On-Chip Instruction Cache 

— Direct Mapped 

— Parallel Load/Decode for Uncached 
Instructions 

■ Multiple Register Sets 

— Sixteen Global 32-Bit Registers 

— Sixteen Local 32-Bit Registers 

— Four Local Register Sets Stored 
On-Chip (Sixteen 32-Bit Registers 
per Set) 

— Register Scoreboarding


■ On-Chip Memory Management Unit 

— 4 Gigabyte Virtual Address Space 
per Task 

— 4 Kbyte Pages with Supervisor/User 
Protection 

■ Built-In Interrupt Controller 

— 32 Priority Levels 

— 248 Vectors 

— Supports M8259A 

— 3.4 µs Latency

■ Easy to Use, High Bandwidth 32-Bit Bus 

— 66.7 MBytes/s Burst 

— Up to 16-Bytes Transferred per Burst 

■ Multitasking and Multiprocessor 
Support 

— Automatic Task Dispatching 
— Prioritized Task Queues 

■ Advanced Package Technology 

— 132 Lead Ceramic Pin Grid Array 
— 164 Lead Ceramic Quad Flatpack 

■ Military Temperature Range 
-55°C to +125°C (Tc)


The 80960MC is the enhanced military member of Intel’s new 32-bit microprocessor family, the 960 series,
which is designed especially for embedded applications. It is based on the family’s high performance, common
core architecture, and includes a 512-byte instruction cache, a built-in interrupt controller, an integrated
floating-point unit and a memory management unit. The 80960MC has a large register set, multiple parallel
execution units, and a high-bandwidth, burst bus. Using advanced RISC technology, this high performance
processor can respond to interrupts in under 3.4 µs and is capable of execution rates in excess of 9.4 million
instructions per second.* The 80960MC is well-suited for a wide range of military and other high reliability
applications, including avionics, airborne radar, navigation, and instrumentation.


*Relative to Digital Equipment Corporation’s VAX-11/780** at 1 MIPS



Figure 1. The 80960MC’s Highly Parallel Microarchitecture 

**VAX-11™ is a trademark of Digital Equipment Corporation.




February 1991 
Order Number: 271080-007 


















THE 960 SERIES 

The 80960MC is the enhanced military member of a 
new family of 32-bit microprocessors from Intel 
known as the 960 Series. This series was especially 
designed to serve the needs of embedded applica- 
tions. The embedded market includes applications 
as diverse as industrial automation, avionics, image 
processing, graphics, robotics, telecommunications, 
and automobiles. These types of applications re- 
quire high integration, low power consumption, quick 
interrupt response times, and high performance.
Since time to market is critical, embedded microprocessors
need to be easy to use in both hardware
and software designs. 


All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications in the embedded
market. For example, future processors may include 
a DMA controller, a timer, or an A/D converter. 

The 80960MC includes an Integrated Floating Point 
Unit (FPU), a Memory Management Unit (MMU), 
multitasking support, and multiprocessor support. 
There are also two commercial members of the fam- 
ily: the 80960KB processor with integrated FPU and 
the 80960KA without floating-point. 
1. Register g15 is reserved for stack management functions.

2. Registers r0, r1, and r2 are reserved for stack management functions.




Figure 2. Register Set 




KEY PERFORMANCE FEATURES 

The 80960MC’s architecture is based on the most 
recent advances in RISC technology and is ground- 
ed in Intel’s long experience in designing embedded 
controllers. Many features contribute to the 
80960MC’s exceptional performance: 

1. Large Register Set. Having a large number of 
registers reduces the number of times that a proces- 
sor needs to access memory. Modern compilers can 
take advantage of this feature to optimize execution 
speed. For maximum flexibility, the 80960MC pro- 
vides thirty-two 32-bit registers (sixteen local and 
sixteen global) and four 80-bit floating-point global 
registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions
make up the bulk of instructions in most programs,
so that execution speed can be greatly improved by
ensuring that these core instructions execute in as
short a time as possible. The most-frequently executed
instructions, such as register-register moves,
add/subtract, logical operations, and shifts, execute
in one to two cycles. (Table 1 contains a list of
instructions.)

3. Load/Store Architecture. Like other processors
based on RISC technology, the 80960MC has a
Load/Store architecture: only the LOAD and STORE
instructions reference memory; all other instructions
operate on registers. This type of architecture simplifies
instruction decoding and is used in combination
with other techniques to increase parallelism.
Figure 3. Instruction Formats
Table 1. 80960MC Instruction Set

Data Movement: Load, Store, Move, Load Address, Load Physical Address

Arithmetic: Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift

Floating Point: Add, Subtract, Multiply, Divide, Remainder, Scale, Round,
Square Root, Sine, Cosine, Tangent, Arctangent, Log, Log Binary, Log Natural,
Exponent, Classify, Copy Real Extended, Compare

Logical: And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor,
Exclusive Nor, Not, Nand, Rotate

Comparison: Compare, Conditional Compare, Compare and Increment,
Compare and Decrement

Branch: Unconditional Branch, Conditional Branch, Compare and Branch

Bit and Bit Field: Set Bit, Clear Bit, Not Bit, Check Bit, Alter Bit,
Scan for Bit, Scan over Bit, Extract, Modify

String: Move String, Move Quick String, Fill String, Compare String,
Scan Byte for Equal

Conversion: Convert Real to Integer, Convert Integer to Real

Decimal: Move, Add with Carry, Subtract with Carry

Call/Return: Call, Call Extended, Call System, Return, Branch and Link

Process Management: Schedule Process, Saves Process, Resume Process,
Load Process Time, Modify Process Controls, Wait, Conditional Wait, Signal,
Receive, Conditional Receive, Send, Send Service, Atomic Add, Atomic Modify

Fault: Conditional Fault, Synchronize Faults

Debug: Modify Trace Controls, Mark, Force Mark

Miscellaneous: Flush Local Registers, Inspect Access,
Modify Arithmetic Controls, Test Condition Code
4. Simple Instruction Formats. All instructions in
the 80960MC are 32 bits long and must be aligned
on word boundaries. This alignment makes it possible
to eliminate the instruction-alignment stage in
the pipeline. To simplify the instruction decoder further,
there are only five instruction formats and each
instruction uses only one format. (See Figure 3.)

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960MC manages this process transparently
to software through the use of a register scoreboard.
Conditional instructions also make use of a
scoreboard so that subsequent unrelated instructions
can be executed while the conditional instruction
is pending.

6. Integer Execution Optimization. When the result
of an operation is used as an operand in a subsequent
calculation, the value is sent immediately to
its destination register. Yet at the same time, the
value is put back on a bypass path to the ALU,
thereby saving the time that otherwise would be required
to retrieve the value for the next operation.

7. Bandwidth Optimizations. The 80960MC gets
optimal use of its memory bus bandwidth because
the bus is tuned for use with the cache: the line size
of the instruction cache matches the maximum burst
size for instruction fetches. The 80960MC automatically
fetches four words in a burst and stores them
directly in the cache. Due to the size of the cache
and the fact that it is continually filled in anticipation
of needed instructions in the program flow, the
80960MC is exceptionally insensitive to memory
wait states. In fact, each wait state causes only a
7% degradation in system performance. The benefit
is that the 80960MC will deliver outstanding performance
even with a low cost memory system.

8. Cache Bypass. If there is a cache miss, the processor
fetches the needed instruction, then sends it
on to the instruction decoder at the same time it
updates the cache. Thus, no extra time is taken to
load and read the cache.


Memory Space and Addressing Modes

The 80960MC allows each task (process) to ad- 
dress a logical memory space of up to 4 Gbytes. In 
turn, each task’s address space is divided into four 
1-Gbyte regions and each region can be mapped to
physical addresses by zero, one, or two levels of 
page tables. The region with the highest addresses 
(Region 3) is common to all tasks. 
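
The arithmetic behind this layout can be sketched in a few lines of Python. The sketch splits a 32-bit logical address into its region number, its page number within the region, and its byte offset within a 4 Kbyte page. The function name `split_address` is hypothetical, and because the text does not say how the page-table levels index the page number, the sketch stops at the page number rather than guessing a table layout.

```python
PAGE_SIZE = 4 * 1024     # 4 Kbyte pages
REGION_SIZE = 1 << 30    # 1 Gbyte regions, four per address space


def split_address(addr):
    """Split a 32-bit logical address into (region, page_in_region, offset)."""
    if not 0 <= addr < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    region = addr // REGION_SIZE              # 0..3
    page = (addr % REGION_SIZE) // PAGE_SIZE  # 0..262143
    offset = addr % PAGE_SIZE                 # 0..4095
    return region, page, offset


# An address in Region 3, the region shared by all tasks
print(split_address(0xC0001234))
```

Each region therefore holds 2^30 / 2^12 = 262,144 pages.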


In keeping with RISC design principles, the number 
of addressing modes has been kept to a minimum 
but includes all those necessary to ensure efficient 
execution of high-level languages such as Ada, C, 
and Fortran. Table 2 lists the memory addressing 
modes. 


Data Types 

The 80960MC recognizes the following data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals 

• 8-, 16-, 32- and 64-bit integers

• 32-, 64- and 80-bit real numbers 

Non-Numeric: 

• Bit 

• Bit Field 

• Triple-Word (96 bits) 

• Quad-Word (128 bits) 


Large Register Set 

The programming environment of the 80960MC in- 
cludes a large number of registers. In fact, 36 regis- 
ters are available at any time. The availability of this 
many registers greatly reduces the number of mem- 
ory accesses required to execute most programs, 
which leads to greater instruction processing speed. 

There are two types of general-purpose registers: 
local and global. The 20 global registers consist of 
sixteen 32-bit registers (G0 through G15) and four
80-bit registers (FP0 through FP3). These registers
perform the same function as the general-purpose 
registers provided in other popular microprocessors. 
The term global refers to the fact that these regis- 
ters retain their contents across procedure calls. 

The local registers, on the other hand, are procedure
specific. For each procedure call, the 80960MC
allocates 16 local registers (R0 through R15). Each
local register is 32 bits wide. Any register can also 
be used for floating-point operations; the 80-bit float- 
ing-point registers are provided for extended preci- 
sion. 


Multiple Register Sets 

To further increase the efficiency of the register set, 
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 
Table 2. Memory Addressing Modes


• 12-Bit Offset

• 32-Bit Offset

• Register-Indirect

• Register + 12-Bit Offset

• Register + 32-Bit Offset

• Register + (Index-Register x Scale-Factor)

• Register x Scale-Factor + 32-Bit Displacement

• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement

Scale-Factor is 1, 2, 4, 8 or 16
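
The most general mode in Table 2 combines a base register, a scaled index register and a displacement. A minimal Python sketch of that address calculation follows; the function name `effective_address` and its keyword arguments are assumptions for illustration, and wrapping the result to 32 bits is likewise assumed rather than taken from the text.

```python
VALID_SCALES = (1, 2, 4, 8, 16)  # scale factors listed in Table 2


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Field at byte 4 of element 5 in an array of 8-byte records at 0x2000
print(hex(effective_address(base=0x2000, index=5, scale=8, displacement=4)))
```

The simpler modes fall out of the same expression by leaving arguments at their defaults, which is why scaled indexing suits arrays of records.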


Although programs may have procedure calls nest- 
ed many calls deep, a program typically oscillates 
back and forth between only two or three levels. As 
a result, with four stack frames in the cache, the 
probability of there being a free frame on the cache 
when a call is made is very high. In fact, runs of 
representative C-language programs show that 80% 
of the calls are handled without needing to access 
memory. 

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.
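
The spill behavior described above can be modeled with a small Python sketch. The class `RegisterCache` and its counters are hypothetical names; the model tracks only which frames are on-chip and counts transfers to and from the procedure stack, and it assumes a frame is reloaded from memory only when a return needs it.

```python
from collections import deque

CACHE_FRAMES = 4  # local register sets held on-chip


class RegisterCache:
    """Toy model of the on-chip local register cache (illustrative only)."""

    def __init__(self):
        self.on_chip = deque([0])  # frame ids, oldest on the left
        self.in_memory = []        # frames spilled to the procedure stack
        self.spills = 0
        self.fills = 0
        self._next_id = 1

    def call(self):
        # With all four frames in use, the oldest set moves to memory.
        if len(self.on_chip) == CACHE_FRAMES:
            self.in_memory.append(self.on_chip.popleft())
            self.spills += 1
        self.on_chip.append(self._next_id)
        self._next_id += 1

    def ret(self):
        if len(self.on_chip) == 1 and not self.in_memory:
            raise RuntimeError("return from the outermost procedure")
        self.on_chip.pop()         # deallocate the current frame
        if not self.on_chip:       # caller's frame was spilled earlier
            self.on_chip.append(self.in_memory.pop())
            self.fills += 1


cache = RegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)  # three nested calls fit on-chip
cache.call()
print(cache.spills)  # the fourth call forces one spill
```

A program that moves back and forth between two or three call levels never leaves the first loop's regime, so it generates no stack traffic in this model.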

Note that the global and floating-point registers are 
not exchanged on a procedure call, but retain their 
contents, making them available to all procedures 
for fast parameter passing. An illustration of the reg- 
ister cache is shown in Figure 4. 
Instruction Cache

To further reduce memory accesses, the 80960MC 
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the 
number of memory references required to read in- 
structions into the processor can be greatly reduced. 

To load the instruction cache, instructions are
fetched in 16-byte blocks, so that up to four instruc- 
tions can be fetched at one time. An efficient 
prefetch algorithm increases the probability that an 
instruction will already be in the cache when it is 
needed. 

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure’s return. 
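
To make the small-loop case concrete, the following Python sketch models a 512-byte direct-mapped cache with 16-byte lines and counts hits and misses for a stream of instruction fetches. The function name `run_fetches` is hypothetical, and the model deliberately ignores the prefetch algorithm mentioned above, so it is a simplification under stated assumptions rather than a description of the actual hardware.

```python
CACHE_BYTES = 512
LINE_BYTES = 16
NUM_LINES = CACHE_BYTES // LINE_BYTES  # 32 lines


def run_fetches(addresses):
    """Return (hits, misses) for a sequence of instruction addresses."""
    tags = [None] * NUM_LINES
    hits = misses = 0
    for addr in addresses:
        line_addr = addr // LINE_BYTES
        index = line_addr % NUM_LINES  # which cache line is used
        tag = line_addr // NUM_LINES   # identifies the memory block held
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill the line with the new block
    return hits, misses


# A 20-instruction loop (80 bytes, five lines) executed ten times
loop = [0x1000 + 4 * i for i in range(20)]
print(run_fetches(loop * 10))
```

Only the first pass through the loop misses, once per 16-byte line; every later iteration is served entirely from the cache.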


Register Scoreboarding 

The instruction decoder has been optimized in several
ways. One of these optimizations is the ability to
do instruction overlapping by means of register 
scoreboarding. 

Register scoreboarding occurs when a LOAD in- 
struction is executed to move a variable from memo- 
ry into a register. When the instruction is initiated, a 
scoreboard bit on the target register is set. When the 
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD R4, address 1 
LOAD R5, address 2 
Unrelated instruction 
Unrelated instruction
ADD R4, R5, R6 

In essence, the two unrelated instructions between
the LOAD and ADD instructions are executed for
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded.
Up to three LOAD instructions can be pending at 
one time with three corresponding scoreboard bits 
set. By exploiting this feature, system programmers 
and compilers have a useful tool for optimizing exe- 
cution speed. 
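
The effect of scoreboarding on instruction ordering can be sketched with a toy cycle counter in Python. The load latency of three cycles, the one-cycle issue cost per instruction and the function name `run` are all assumptions chosen for illustration; the model also does not enforce the limit of three pending loads.

```python
LOAD_LATENCY = 3  # assumed cycles from issue until the register is written


def run(program):
    """program: list of (op, dest, sources). Returns total cycles used."""
    ready_at = {}  # register -> cycle at which its pending load completes
    cycle = 0
    for op, dest, sources in program:
        # Stall until the scoreboard bit of every source register is clear.
        for reg in sources:
            cycle = max(cycle, ready_at.get(reg, 0))
        if op == "LOAD":
            ready_at[dest] = cycle + LOAD_LATENCY  # scoreboard bit set
        cycle += 1                                 # one issue slot each
    return cycle


overlapped = [
    ("LOAD", "R4", []),
    ("LOAD", "R5", []),
    ("OTHER", "R8", []),
    ("OTHER", "R9", []),
    ("ADD", "R6", ["R4", "R5"]),
]
serial = [overlapped[0], overlapped[1], overlapped[4],
          overlapped[2], overlapped[3]]

print(run(overlapped), run(serial))
```

Under these assumptions the overlapped ordering finishes in 5 cycles and the serial ordering in 7, so the two unrelated instructions cost nothing when placed between the loads and the ADD.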


Memory Management and Protection 

The 80960MC will be especially useful for multitask- 
ing applications that require software protection and 
a very large address space. To ensure the highest 
level of performance possible, the memory manage- 
ment unit and translation look-aside buffer (TLB) are 
contained on-chip. 

The 80960MC supports a conventional form of de- 
mand-paged virtual memory in which the address 
space is divided into 4 Kbyte pages. Studies have
shown that a 4 Kbyte page is the optimum size for a 
broad range of applications. 


Each page table entry includes a 2-bit page rights
field that specifies whether the page is a no-access,
read-only, or read-write page. This field is interpreted
differently depending on whether the current task
(process) is executing in user or supervisor mode, as
shown below:


Rights    User          Supervisor
00        No Access     Read-Only
01        No Access     Read-Write
10        Read-Only     Read-Write
11        Read-Write    Read-Write
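
The page-rights table translates directly into a lookup. The Python sketch below encodes it and checks whether a given access is permitted; the names `PAGE_RIGHTS` and `access_allowed` and the string labels are assumptions for this sketch, not part of the processor's interface.

```python
# Access level granted by each 2-bit rights value, per execution mode
PAGE_RIGHTS = {
    0b00: {"user": "none", "supervisor": "read-only"},
    0b01: {"user": "none", "supervisor": "read-write"},
    0b10: {"user": "read-only", "supervisor": "read-write"},
    0b11: {"user": "read-write", "supervisor": "read-write"},
}


def access_allowed(rights, mode, write):
    """Return True if an access is permitted for the given rights and mode."""
    level = PAGE_RIGHTS[rights][mode]
    if level == "none":
        return False
    if write:
        return level == "read-write"
    return True  # reads are allowed for read-only and read-write pages


print(access_allowed(0b10, "user", write=False))  # user read of a 10 page
print(access_allowed(0b10, "user", write=True))   # user write of a 10 page
```

Note that supervisor mode always has at least read access, while user mode is locked out entirely for the two lowest rights values.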


Floating-Point Arithmetic 

In the 80960MC, floating-point arithmetic has been 
made an integral part of the architecture. Having the 
floating-point unit integrated on-chip provides two
advantages. First, it improves the performance of
the chip for floating-point applications, since no 
additional bus overhead is associated with floating- 
point calculations, thereby leaving more time for oth- 
er bus operations such as I/O. Second, the cost of 
using floating-point operations is reduced because a 
separate coprocessor chip is not required. 

The 80960MC floating-point (real number) data 
types include single-precision (32-bit), double-preci- 
sion (64-bit), and extended precision (80-bit) float- 
ing-point numbers. Any register may be used to exe- 
cute floating-point operations. 
The processor provides hardware support for both
mandatory and recommended portions of IEEE 
Standard 754 for floating-point arithmetic, including 
all arithmetic, exponential, logarithmic, and other 
transcendental functions. Table 3 shows execution 
times for some representative instructions. 

Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz


Function       32-Bit    64-Bit
Add              0.4       0.5
Subtract         0.4       0.5
Multiply         0.7       1.3
Divide           1.3       2.9
Square Root      3.7       3.9
Arctangent      10.1      13.1
Exponent        11.3      12.5
Sine            15.2      16.6
Cosine          15.2      16.6
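
The times in Table 3 can be converted to approximate clock counts, since one microsecond at 25 MHz is 25 clock cycles. The short Python sketch below does that conversion; the name `to_cycles` is hypothetical, and because the table values are rounded to 0.1 µs the resulting cycle counts are estimates, not figures from the datasheet.

```python
CLOCK_MHZ = 25  # clock rate the table is quoted at

# (32-bit, 64-bit) execution times in microseconds, from Table 3
TIMES_US = {
    "Add": (0.4, 0.5),
    "Multiply": (0.7, 1.3),
    "Divide": (1.3, 2.9),
    "Sine": (15.2, 16.6),
}


def to_cycles(microseconds):
    """Approximate clock cycles for a time given in microseconds."""
    return round(microseconds * CLOCK_MHZ, 1)


for name, (single, double) in TIMES_US.items():
    print(name, to_cycles(single), to_cycles(double))
```

For example, a 32-bit add at 0.4 µs corresponds to roughly 10 clocks, and a 64-bit sine at 16.6 µs to roughly 415 clocks.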


Multitasking Support 

Multitasking programs commonly involve the moni- 
toring and control of an external operation, such as 
the activities of a process controller or the move- 
ments of a machine tool. These programs generally 
consist of a number of processes that run indepen- 
dently of one another, but share a common data- 
base or pass data among themselves. 

The 80960MC offers several hardware functions de- 
signed to support multitasking systems. One unique 
feature, called self-dispatching, allows a processor 
to switch itself automatically among scheduled 
tasks. When self-dispatching is used, all the operat- 
ing system is required to do is place the task in the 
scheduling queue. 

When the processor becomes available, it dis- 
patches the task from the beginning of the queue 
and then executes it until it becomes blocked, inter- 
rupted, or until its time-slice expires. It then returns 
the task to the end of the queue (i.e., automatically 
reschedules it) and dispatches the next ready task. 


During these operations, no communication be- 
tween the processor and the operating system is 
necessary until the running task is complete or an 
interrupt is issued. 
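
The dispatch cycle just described behaves like a round-robin queue. The Python sketch below models it with a double-ended queue; the function name `self_dispatch`, the notion of abstract work units and the fixed time slice are assumptions for illustration, and the model leaves out priorities, blocking and interrupts.

```python
from collections import deque


def self_dispatch(tasks, time_slice):
    """tasks: dict of name -> remaining work units.

    Returns the sequence of (task, units_run) slices in execution order.
    """
    queue = deque(tasks.items())
    trace = []
    while queue:
        name, remaining = queue.popleft()  # dispatch from the front
        ran = min(time_slice, remaining)
        trace.append((name, ran))
        remaining -= ran
        if remaining > 0:
            queue.append((name, remaining))  # reschedule at the end
    return trace


print(self_dispatch({"A": 3, "B": 1, "C": 2}, time_slice=2))
```

Task A exhausts its slice, goes to the back of the queue, and finishes only after B and C have each had a turn.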


Synchronization and Communication 

The 80960MC also offers instructions to set up and 
test semaphores to ensure that concurrent tasks 
remain synchronized and no data inconsistency 
results. Special data structures, known as communi- 
cation ports, provide the means for exchanging 
parameters and data structures. Transmission of in- 
formation by means of communication ports is asyn- 
chronous and automatically buffered by the proces- 
sor. 


Communication between tasks by means of ports 
can be carried out independently of the operating 
system. Once the ports have been set up by the 
programmer, the processor handles the message 
passing automatically. 



High Bandwidth Local Bus


An 80960MC CPU resides on a high-bandwidth ad- 
dress/data bus known as the local bus (L-Bus). The 
L-Bus provides a direct communication path between
the processor and the memory and I/O subsystem
interfaces. The processor uses the local bus
to fetch instructions, manipulate memory, and re- 
spond to interrupts. Its features include: 

• 32-bit multiplexed address/data path

• Four-word burst capability, which allows transfers
from 1 to 16 bytes at a time

• High bandwidth reads and writes at 66.7 MBytes
per second

• Special signal to indicate whether a memory
transaction can be cached
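
The bandwidth figure can be related to the burst size with a little arithmetic. The Python sketch below assumes the 66.7 MBytes/s figure refers to back-to-back 16-byte bursts at a 25 MHz clock; the six-clock burst length it uses is inferred from those numbers and is not stated in the text. The name `burst_bandwidth` is hypothetical.

```python
CLOCK_HZ = 25_000_000  # 25 MHz clock, assumed to match the quoted figure
BURST_BYTES = 16       # four 32-bit words per burst


def burst_bandwidth(clocks_per_burst):
    """Bytes per second when bursts of this length run back to back."""
    return BURST_BYTES * CLOCK_HZ / clocks_per_burst


print(round(burst_bandwidth(6) / 1e6, 1))  # MBytes per second
```

A burst that occupied six clocks would give 16 x 25 / 6, or about 66.7 MBytes per second, which is consistent with the figure quoted above.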


Figure 5 identifies the groups of signals which constitute
the L-Bus. Table 4 lists the function of the L-Bus
and other processor-support signals, such as
the interrupt lines.

Figure 5. Local Bus Signal Groups 


Multiple Processor Support 

One means of increasing the processing power of a 
system is to run two or more processors in parallel. 
Since microprocessors are not generally designed to 
run in tandem with other processors, designing such 
a system is usually difficult and costly. 

The 80960MC solves this problem by offering a 
number of functions to coordinate the actions of 
multiple processors. First, messages can be passed 
between processors to initiate actions such as flush- 
ing a cache, stopping or starting another processor, 
or preempting a task. The messages are passed on 
the bus and allow multiple processors to run togeth- 
er smoothly, with rare need to lock the bus or memo- 
ry. 

Second, a set of synchronization instructions helps
maintain the coherency of memory. These instructions
permit several processors to modify memory at
the same time without inserting inaccuracies or
ambiguities into shared data structures.

The self-dispatching mechanism, in addition to being 
used in single-processor systems, provides the 
means to increase the performance of a system 
merely by adding processors. Each processor can 
either work on the same pool of tasks (sharing the 
same queue with other processors) or can be re- 
stricted to its own queue. 

When processors perform system operations, they
synchronize themselves by using atomic operations
and sending special messages to each other.
Changing the number of processors in a system
never requires a software change. Software will
execute correctly regardless of the number of
processors in the system; systems with more processors
simply execute faster.


Interrupt Handling 

The 80960MC can be interrupted in one of two
ways: by the activation of one of four interrupt pins
or by sending a message on the processor's data
bus.

The 80960MC is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide M8259A handshaking for expansion beyond 
four interrupt lines. 

An interrupt message is made up of a vector number
and an interrupt priority. If the interrupt priority is
greater than that of the currently running task, the
processor accepts the interrupt and uses the vector
as an index into the interrupt table. If the priority of
the interrupt message is below that of the current
task, the processor saves the information in a
section of the interrupt table reserved for pending
interrupts.
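
The accept-or-pend decision described above can be sketched in a few lines. This is an illustrative software model only: the function name, the list used for pending interrupts, and the treatment of an equal priority as pending are assumptions, since the text defines only the "greater than" and "below" cases.

```python
def handle_interrupt_message(vector, priority, current_priority, pending):
    """Model the priority check applied to an incoming interrupt message.

    Returns ("accept", vector) when the message outranks the running task;
    otherwise records it in `pending` and returns ("pend", vector).
    Equal priority is treated as pending (an assumption).
    """
    if priority > current_priority:
        # The vector would be used as an index into the interrupt table.
        return ("accept", vector)
    pending.append((priority, vector))
    return ("pend", vector)


pending = []
print(handle_interrupt_message(18, 20, 10, pending))
print(handle_interrupt_message(7, 5, 10, pending))
```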


Debug Features 

The 80960MC has built-in debug capabilities. There
are two types of breakpoints and six different trace
modes. The debug features are controlled by two
internal 32-bit registers, the Process-Controls Word
and the Trace-Controls Word. By setting bits in
these control words, a software debug monitor can
closely control how the processor responds during
program execution.

The 80960MC has both hardware and software
breakpoints. It provides two hardware breakpoint
registers on-chip which can be set by a special
command to any value. When the instruction pointer
matches the value in one of the breakpoint registers,
the breakpoint will fire, and a breakpoint handling
routine is called automatically.

The 80960MC also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 

Tracing is available for instructions (single-step
execution), calls and returns, and branching. Each
different type of trace may be enabled separately by a
special debug instruction. In each case, the
80960MC executes the instruction first and then
calls a trace handling routine (usually part of a
software debug monitor). Further program execution is
halted until the trace routine is completed. When the
trace event handling routine is completed, instruction
execution resumes at the next instruction. The
80960MC's tracing mechanisms, which are implemented
completely in hardware, greatly simplify the
task of testing and debugging software.


FAULT DETECTION 

The 80960MC has an automatic mechanism to
handle faults. There are ten fault types including
trace, arithmetic, and floating-point faults. When the
processor detects a fault, it automatically calls the
appropriate fault handling routine and saves the
current instruction pointer and necessary state
information to make efficient recovery possible. The
processor posts diagnostic information on the type of fault
to a Fault Record. Like interrupt handling routines,
fault handling routines are usually written to meet
the needs of a specific application and are often
included as part of the operating system or kernel.

For each of the ten fault types, there are numerous
subtypes that provide specific information about a
fault. For example, a floating-point fault may have its
subtype set to an Overflow or Zero-Divide fault. The
fault handler can use this specific information to
respond correctly to the fault.
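
As a rough illustration of how a handler might use the type and subtype, the sketch below dispatches on a (type, subtype) pair and falls back to a per-type handler and then to a default. All names and the string labels are hypothetical; the layout of the Fault Record is not described here and is not modeled.

```python
def make_fault_dispatcher(handlers, default_handler):
    """Build a dispatcher keyed on (fault_type, subtype).

    Lookup order: exact (type, subtype), then (type, None), then default.
    """
    def dispatch(fault_type, subtype):
        handler = (handlers.get((fault_type, subtype))
                   or handlers.get((fault_type, None))
                   or default_handler)
        return handler(fault_type, subtype)
    return dispatch


# Hypothetical handlers returning a description of the recovery action.
handlers = {
    ("floating-point", "overflow"): lambda t, s: "clamp result",
    ("floating-point", "zero-divide"): lambda t, s: "report divide error",
    ("arithmetic", None): lambda t, s: "generic arithmetic recovery",
}
dispatch = make_fault_dispatcher(handlers, lambda t, s: "halt task")

print(dispatch("floating-point", "zero-divide"))
print(dispatch("trace", "instruction"))
```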


Interagent Communications (IAC)

In order to coordinate their actions, processors in a
multiple processor system need a means for
communicating with each other. The 80960MC does this
through a mechanism known as Interagent
Communication messages or IACs.

IAC messages cause a variety of actions including
starting and stopping processors, flushing instruction
caches and TLBs, and sending interrupts to other
processors in the system. The upper 16 Mbytes of
the processor's physical memory space is reserved
for sending and receiving IAC messages.
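
The address range implied by "the upper 16 Mbytes" can be worked out directly. The sketch assumes a 32-bit physical address space (the L-Bus carries 32-bit physical addresses) and 1 Mbyte = 2**20 bytes; the constant and function names are illustrative.

```python
PHYSICAL_SPACE = 1 << 32          # 32-bit physical address space (assumed)
IAC_REGION_SIZE = 16 * (1 << 20)  # 16 Mbytes, taking 1 Mbyte = 2**20 bytes
IAC_REGION_BASE = PHYSICAL_SPACE - IAC_REGION_SIZE


def in_iac_region(address):
    """True if a physical address falls in the reserved upper 16 Mbytes."""
    return IAC_REGION_BASE <= address < PHYSICAL_SPACE


print(hex(IAC_REGION_BASE))  # 0xff000000
```

Under these assumptions, ordinary memory and I/O would need to be kept below the printed base address.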


BUILT-IN TESTABILITY 

Upon reset, the 80960MC automatically conducts an
exhaustive internal test of its major blocks of logic.
Then, before executing its first instruction, it does a
zero check sum on the first eight words in memory
to ensure that the system has been loaded correctly.
If a problem is discovered at any point during the
self-test, the 80960MC will assert its FAILURE pin
and will not begin program execution. The self-test
takes approximately 47,000 cycles to complete.
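
The idea of a zero checksum over the first eight words can be sketched as follows. The exact arithmetic the processor uses (for example, how carries are treated) is not given here, so this sketch assumes a plain 32-bit modular sum that must equal zero; both function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def zero_checksum_ok(words):
    """Return True if eight 32-bit words sum to zero modulo 2**32.

    Plain modular addition is an assumption, not the documented algorithm.
    """
    if len(words) != 8:
        raise ValueError("expected exactly eight words")
    return sum(w & MASK32 for w in words) & MASK32 == 0


def balancing_word(first_seven):
    """Compute the eighth word that makes the modular sum zero."""
    return (-sum(first_seven)) & MASK32


seven = [0x1000, 0x2000, 0, 0, 0, 0, 0xFFFFFFFF]
image = seven + [balancing_word(seven)]
print(zero_checksum_ok(image))
```

A build tool could use a helper like `balancing_word` to fill in a spare word of the initial memory image so that the check passes.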



System manufacturers can use the 80960MC's
self-test feature during incoming parts inspection. No
special diagnostic programs need to be written, and
the test is both thorough and fast. The self-test
capability helps ensure that defective parts will be
discovered before systems are shipped, and once in
the field, the self-test makes it easier to distinguish
between problems caused by processor failure and
problems resulting from other causes.


COMPATIBILITY WITH 80960K-SERIES 

Application programs written for the 80960K-Series
microprocessors can be run on the 80960MC without
modification. The 80960K-Series instruction set
forms the core of the 80960MC's instructions, so
binary compatibility is assured.
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CHMOS 

The 80960MC is fabricated using Intel's CHMOS IV
(Complementary High Speed Metal Oxide Semiconductor)
process. This advanced technology eliminates
the frequency and reliability limitations of older
CMOS processes and opens a new era in
microprocessor performance. It combines the high
performance capabilities of Intel's industry-leading
HMOS technology with the high density and low
power characteristics of CMOS. The 80960MC is
available at 16, 20 and 25 MHz.


Table 4a. 80960MC Pin Description: L-Bus Signals 


Symbol          Type      Name and Function

CLK2            I         SYSTEM CLOCK provides the fundamental timing for 80960MC
                          systems. It is divided by two inside the 80960MC to
                          generate the internal processor clock. CLK2 is shown in
                          Figure 9.

LAD31-LAD0      I/O T.S.  LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses
                          and data to and from memory. During an address (Ta)
                          cycle, bits 2-31 contain a physical word address (bits
                          0-1 indicate SIZE; see below). During a data (Td) cycle,
                          bits 0-31 contain read or write data. The LAD lines are
                          active HIGH and float to a high impedance state when not
                          active.

                          SIZE, which is comprised of bits 0-1 of the LAD lines
                          during a Ta cycle, specifies the size of a transfer in
                          words for a burst transaction.

                            LAD1  LAD0
                             0     0    1 Word
                             0     1    2 Words
                             1     0    3 Words
                             1     1    4 Words

ALE             O T.S.    ADDRESS-LATCH ENABLE indicates the transfer of a physical
                          address. ALE is asserted during a Ta cycle and deasserted
                          before the beginning of the Td state. It is active LOW
                          and floats to a high impedance state when the processor
                          is idle or is at the end of any bus access.

ADS             O O.D.    ADDRESS STATUS indicates an address state. ADS is
                          asserted every Ta state and deasserted during the
                          following Td state. For a burst transaction, ADS is
                          asserted again every Td state where READY was asserted
                          in the previous cycle.

W/R             O O.D.    WRITE/READ specifies, during a Ta cycle, whether the
                          operation is a write or read. It is latched on-chip and
                          remains valid during Td and Tw states.

DT/R            O O.D.    DATA TRANSMIT/RECEIVE indicates the direction of data
                          transfer to and from the L-Bus. It is low during Ta, Tw
                          and Td cycles for a read or interrupt acknowledgement;
                          it is high during Ta, Tw and Td cycles for a write. DT/R
                          never changes state when DEN is asserted (see Timing
                          Diagrams).

DEN             O O.D.    DATA ENABLE is asserted during Td and Tw cycles and
                          indicates transfer of data on the LAD bus lines.

READY           I         READY indicates that data on LAD lines can be sampled or
                          removed. If READY is not asserted during a Td cycle, the
                          Td cycle is extended to the next cycle by inserting wait
                          states (Tw), and ADS is not asserted in the next cycle.

LOCK            I/O O.D.  BUS LOCK prevents other bus masters from gaining control
                          of the L-Bus following the current cycle (if they would
                          assert LOCK to do so). LOCK is used by the processor or
                          any bus agent when it performs indivisible
                          Read/Modify/Write (RMW) operations.

                          For a read that is designated as an RMW-read, LOCK is
                          examined: if asserted, the processor waits until it is
                          not asserted; if not asserted, the processor asserts
                          LOCK during the Ta cycle and leaves it asserted.

                          A write that is designated as an RMW-write deasserts
                          LOCK in the Ta cycle.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = three state
Ta = TAddress, Td = TData, Tw = TWait, Tr = TRecovery, Ti = TIdle, Th = THold
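
The address-cycle encoding of the LAD lines (word address in bits 2-31, SIZE in bits 0-1) can be illustrated with a short decoder. This is a sketch of the bit layout described in the table only; the function name is hypothetical and no bus timing is modeled.

```python
def decode_address_cycle(lad):
    """Split a 32-bit LAD value sampled during an address cycle.

    Returns (address, words): the word-aligned physical address taken from
    bits 2-31 and the burst length taken from SIZE in bits 0-1
    (00 = 1 word ... 11 = 4 words).
    """
    address = lad & 0xFFFFFFFC   # bits 2-31, low two bits cleared
    words = (lad & 0x3) + 1      # SIZE field encodes (words - 1)
    return address, words


address, words = decode_address_cycle(0x10000003)
print(hex(address), words, "words =", words * 4, "bytes")
```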
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Table 4a. 80960MC Pin Description: L-Bus Signals (Continued) 


Symbol          Type      Name and Function

BE3-BE0         O O.D.    BYTE ENABLE LINES specify which data bytes (up to four)
                          on the bus take part in the current bus cycle. BE3
                          corresponds to LAD31-LAD24 and BE0 corresponds to
                          LAD7-LAD0.

                          The byte enables are provided in advance of data. The
                          byte enables asserted during Ta specify the bytes of the
                          first data word. The byte enables asserted during Td
                          specify the bytes of the next data word (if any), that
                          is, the word to be transmitted following the next
                          assertion of READY. The byte enables during the Td
                          cycles preceding the last assertion of READY are
                          undefined. The byte enables are latched on-chip and
                          remain constant from one Td cycle to the next when READY
                          is not asserted.

                          For reads, the byte enables specify the byte(s) that the
                          processor will actually use. 80960MCs will assert only
                          adjacent byte enables (e.g., asserting just BE0 and BE2
                          is not permitted), and are required to assert at least
                          one byte enable. Accesses must also be naturally aligned
                          (e.g., asserting BE1 and BE2 is not allowed even though
                          they are adjacent). To produce address bits A0 and A1
                          externally, they can be decoded from the byte enables.

HOLD            I         HOLD indicates a request from a secondary bus master to
(HLDAR)                   acquire the bus. If the processor is initialized as the
                          primary bus master this input will be interpreted as
                          HOLD. When the processor receives HOLD and grants
                          another master control of the bus, it floats its
                          three-state bus lines, asserts HOLD ACKNOWLEDGE, and
                          enters the Th state. When HOLD is deasserted, the
                          processor will deassert HOLD ACKNOWLEDGE and go to
                          either the Ti or Ta state.

                          HOLD ACKNOWLEDGE RECEIVED indicates that the processor
                          has acquired the bus. If the processor is initialized as
                          the secondary bus master this input is interpreted as
                          HLDAR.

                          HOLD timing is shown in Figure 11.

HLDA            O T.S.    HOLD ACKNOWLEDGE relinquishes control of the bus to
(HOLDR)                   another bus master. If the processor is initialized as
                          the primary bus master this output will be interpreted
                          as HLDA. When HOLD is deasserted, the processor will
                          deassert HLDA and go to either the Ti or Ta state.

                          HOLD REQUEST indicates a request to acquire the bus. If
                          the processor is initialized as the secondary bus master
                          this output will be interpreted as HOLDR.

                          HOLD timing is shown in Figure 11.

CACHE           O T.S.    CACHE indicates if an access is cacheable during a Ta
                          cycle. The CACHE signal floats to a high impedance state
                          when the processor is idle.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = three state
Ta = TAddress, Td = TData, Tw = TWait, Tr = TRecovery, Ti = TIdle, Th = THold
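
The byte-enable rules above (at least one enable, adjacent enables only, naturally aligned) lend themselves to a small check, together with the external decoding of address bits A1 and A0. In this sketch a set bit means "asserted" regardless of the electrical polarity of the pins, and treating three-byte patterns as invalid is an interpretation of "naturally aligned"; the names are illustrative.

```python
# Bit n of the mask set means BEn is asserted.
# Allowed: any single byte, an aligned half-word, or the full word.
VALID_BYTE_ENABLES = {
    0b0001, 0b0010, 0b0100, 0b1000,   # one byte
    0b0011, 0b1100,                   # aligned half-words
    0b1111,                           # full word
}


def low_address_bits(be_mask):
    """Return the value of A1:A0 (0-3) implied by a byte-enable pattern.

    Raises ValueError for patterns that are empty, non-adjacent, or not
    naturally aligned under the interpretation used here.
    """
    if be_mask not in VALID_BYTE_ENABLES:
        raise ValueError(f"illegal byte-enable pattern {be_mask:04b}")
    lowest = be_mask & -be_mask        # isolate the lowest asserted enable
    return lowest.bit_length() - 1


print(low_address_bits(0b1100))  # upper half-word
```

For example, asserting only BE2 and BE3 yields A1:A0 = 2, while the BE1 and BE2 pattern called out in the table is rejected.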
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Table 4b. 80960MC Pin Description: Module Support Signals 


Symbol          Type      Name and Function

BADAC           I         BAD ACCESS, if asserted in the cycle following the one in
                          which the last READY of a transaction is asserted,
                          indicates that an unrecoverable error has occurred on
                          the current bus transaction, or that a synchronous
                          load/store instruction has not been acknowledged.

                          STARTUP: During system reset, the BADAC signal is
                          interpreted differently. If the signal is high, it
                          indicates that this processor will perform system
                          initialization. If it is low, another processor in the
                          system will perform system initialization instead.

RESET           I         RESET clears the internal logic of the processor and
                          causes it to re-initialize.

                          During RESET assertion, the input pins are ignored
                          (except for BADAC and IAC/INT0), the tri-state output
                          pins are placed in a high impedance state, and other
                          output pins are placed in their non-asserted state.

                          RESET must be asserted for at least 41 CLK2 cycles for a
                          predictable RESET. The HIGH to LOW transition of RESET
                          should occur after the rising edge of both CLK2 and the
                          external bus CLK, and before the next rising edge of
                          CLK2.

                          RESET timing is shown in Figure 10.

FAILURE         O O.D.    INITIALIZATION FAILURE indicates that the processor has
                          failed to initialize correctly. After RESET is
                          deasserted and before the first bus transaction begins,
                          FAILURE is asserted while the processor performs a
                          self-test. If the self-test completes successfully, then
                          FAILURE is deasserted. Next, the processor performs a
                          zero checksum on the first eight words of memory. If it
                          fails, FAILURE is asserted for a second time and remains
                          asserted; if it passes, system initialization continues
                          and FAILURE remains deasserted.

N.C.            N/A       NOT CONNECTED indicates pins should not be connected.
                          Never connect any pin marked N.C.

IAC             I         INTERAGENT COMMUNICATION REQUEST/INTERRUPT 0 indicates
(INT0)                    either that there is a pending IAC message for the
                          processor or an interrupt. The bus interrupt control
                          register determines in which way the signal should be
                          interpreted.

                          To signal an interrupt or IAC request in a synchronous
                          system, this pin (as well as the other interrupt pins)
                          must be enabled by being deasserted for at least one bus
                          cycle and then asserted for at least one additional bus
                          cycle; in an asynchronous system, the pin must remain
                          deasserted for at least two bus cycles and then be
                          asserted for at least two more bus cycles.

                          LOCAL PROCESSOR NUMBER: This signal is interpreted
                          differently during system reset. If the signal is at a
                          high voltage level, it indicates that this processor is
                          a primary bus master (Local Processor Number = 0); if it
                          is at a low voltage level, it indicates that this
                          processor is a secondary bus master (Local Processor
                          Number = 1).

INT1            I         INTERRUPT 1, like INT0, provides direct interrupt
                          signaling.

INT2            I         INTERRUPT 2/INTERRUPT REQUEST: The bus interrupt control
(INTR)                    register determines how this pin is interpreted. If
                          INT2, it has the same interpretation as the INT0 and
                          INT1 pins. If INTR, it is used to receive an interrupt
                          request from an external interrupt controller.

INT3            I/O O.D.  INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt
(INTA)                    control register determines how this pin is interpreted.
                          If INT3, it has the same interpretation as the INT0,
                          INT1, and INT2 pins. If INTA, it is used as an output to
                          control interrupt-acknowledge bus transactions. The INTA
                          output is latched on-chip and remains valid during Td
                          cycles; as an output, it is open-drain.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = three state
Ta = TAddress, Td = TData, Tw = TWait, Tr = TRecovery, Ti = TIdle, Th = THold
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ELECTRICAL SPECIFICATIONS 
Power and Grounding 

The 80960MC is implemented in CHMOS III technology
and has modest power requirements. Its high
clock frequency and numerous output buffers
(address/data, control, error and arbitration signals)
can cause power surges as multiple output buffers
drive new signal levels simultaneously. For clean
on-chip power distribution at high frequency, 12 VCC
and 13 VSS pins separately feed functional units of
the 80960MC.

Power and ground connections must be made to all
power and ground pins of the 80960MC. On the
circuit board, all VCC pins must be strapped closely
together, preferably on a power plane. Likewise, all
VSS pins should be strapped together, preferably on
a ground plane.


Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960MC. The processor can cause tran- 
sient power surges when driving the L-Bus, particu- 
larly when it is connected to a large capacitive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and decou- 
pling capacitors as much as possible. 


Connection Recommendations 

For reliable operation, always connect unused
inputs to an appropriate signal level. In particular, if
one or more interrupt lines are not used, they should
be pulled up or down to their respective deasserted
states. No inputs should ever be left floating.

All open-drain outputs require a pullup device. While
in some cases a simple pullup resistor will be
adequate, we recommend a network of pullup and
pulldown resistors biased to a valid VIH (≥ 3.4V) and
terminated in the characteristic impedance of the
circuit board. Figure 6 shows our recommendations for
the resistor values for both a low and high current
drive network, which assumes that the circuit board
has a characteristic impedance of 100Ω. The
advantage of terminating the output signals in this fashion
is that it limits signal swing and reduces AC power
consumption.


Characteristic Curves 

Figure 7 shows the typical supply current
requirements over the operating temperature range of the
processor at a supply voltage (VCC) of 5V. Figure 8
shows the typical power supply current (ICC)
required by the 80960MC at various operating
frequencies when measured at three input voltage
(VCC) levels.

Figure 9 shows the typical capacitive derating curve
for the 80960MC measured from 1.5V on the system
clock (CLK) to 0.8V on the falling edge and 2.0V on
the rising edge of the L-Bus address/data (LAD)
signals.
Low Drive Network:   • VOH = 3.42V   • IOL = 25.3 mA

High Drive Network:  • VOH = 3.41V   • IOL = 33.8 mA

Figure 6. Connection Recommendations for Low and High Current Drive Networks
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Figure 7. Typical Supply Current (Ice) 


Test Load Circuit 

Figure 10 illustrates the load circuit used to test the
80960MC's tristate pins, and Figure 11 shows the
load circuit used to test the open drain outputs. The
open drain test uses an active load circuit in the form
of a matched diode bridge. Since the open-drain
outputs sink current, only the IOL legs of the bridge are
necessary and the IOH legs are not used. When the
80960MC driver under test is turned off, the output
pin is pulled up to VREF (i.e., VOH). Diode D1 is
turned off and the IOL current source flows through
diode D2.

When the 80960MC open-drain driver under test is
on, diode D1 is also on, and the voltage on the pin
being tested drops to VOL. Diode D2 turns off and
IOL flows through diode D1.
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Figure 8. Typical Current vs Frequency 
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Figure 10. Test Load Circuit for 
TRI-STATE Output Pins 



Figure 11. Test Load Circuit for Open-Drain Output Pins
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ABSOLUTE MAXIMUM RATINGS* 

Case Temperature under Bias(7) ........ -55°C to +125°C

Storage Temperature ................... -65°C to +150°C

Voltage on Any Pin .................... -0.5V to VCC + 0.5V

Power Dissipation ..................... 2.6W (25 MHz)

NOTICE: This data sheet contains information on
products in the sampling and initial production phases
of development. The specifications are subject to
change without notice. Verify with your local Intel
Sales office that you have the latest data sheet
before finalizing a design.

*WARNING: Stressing the device beyond the "Absolute
Maximum Ratings" may cause permanent damage.
These are stress ratings only. Operation beyond the
"Operating Conditions" is not recommended and
extended exposure beyond the "Operating Conditions"
may affect device reliability.

D.C. CHARACTERISTICS

80960MC: TCASE(6) = -55°C to +125°C, VCC = 5V ± 5%


Symbol  Parameter                      Min        Max        Units  Test Conditions

VIL     Input Low Voltage              -0.3       +0.8       V
VIH     Input High Voltage             2.0        VCC + 0.3  V
VCL     CLK2 Input Low Voltage         -0.3       +0.8       V
VCH     CLK2 Input High Voltage        0.55 VCC   VCC + 0.3  V
VOL     Output Low Voltage                        0.45       V      (1, 5)
VOH     Output High Voltage            2.4                   V      (2, 4)
ICC     Power Supply Current:
          16 MHz                                  375        mA
          20 MHz                                  420        mA
          25 MHz                                  480        mA
ILI     Input Leakage Current                     ±15        µA     0 ≤ VO ≤ VCC
ILO     Output Leakage Current                    ±15        µA     0.45 ≤ VO ≤ VCC
CIN     Input Capacitance                         10         pF     fC = 1 MHz (3)
CO      I/O or Output Capacitance                 12         pF     fC = 1 MHz (3)
CCLK    Clock Capacitance                         10         pF     fC = 1 MHz (3)
θJA     Thermal Resistance
        (Junction-to-Ambient)
          Pin Grid Array                          21         °C/W
          Ceramic Quad Flatpack                   29         °C/W
θJC     Thermal Resistance
        (Junction-to-Case)
          Pin Grid Array                          4          °C/W
          Ceramic Quad Flatpack                   8          °C/W

NOTES:

1. For three-state outputs, this parameter is measured at:
     Address/Data ............ 4.0 mA
     Controls ................ 5.0 mA

2. This parameter is measured at:
     Address/Data ............ -1.0 mA
     Controls ................ -0.9 mA
     ALE ..................... -5.0 mA

3. Input, output, and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. For open-drain outputs .... 25 mA

6. Case temperatures are "instant on".
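
The junction-to-case thermal resistance can be combined with the power dissipation rating to estimate how far the junction sits above the case. The sketch below uses the standard relation Tj = Tc + P × θJC with the 2.6W (25 MHz) power figure and the θJC values listed in this section; treating 2.6W as the operating dissipation is an assumption made for a worst-case estimate, and the function name is illustrative.

```python
def junction_temperature_c(case_temp_c, power_w, theta_jc_c_per_w):
    """Estimate junction temperature from case temperature: Tj = Tc + P * theta_JC."""
    return case_temp_c + power_w * theta_jc_c_per_w


POWER_W = 2.6                       # power figure quoted for 25 MHz
THETA_JC = {"Pin Grid Array": 4, "Ceramic Quad Flatpack": 8}   # degrees C per watt

for package, theta in THETA_JC.items():
    tj = junction_temperature_c(125, POWER_W, theta)
    print(f"{package}: Tj is about {tj:.1f} C at a 125 C case")
```

The same relation with θJA in place of θJC and ambient temperature in place of case temperature gives a first estimate for a part without additional cooling.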
AC SPECIFICATIONS

This section describes the AC specifications for the
80960MC pins. All input and output timings are
specified relative to the 1.5V level of the rising edge
of CLK2, and refer to the time at which the signal
reaches (for output delay and input setup) or leaves
(for hold time) the TTL levels of LOW (0.8V) or HIGH
(2.0V). All AC testing should be done with input
voltages of 0.4V and 2.4V, except for the clock (CLK2),
which should be tested with input voltages of 0.45V
and 0.55 VCC.





NOTE 1: 

For Tri-State pins, T6 and T9 are measured at 1.5V.

For Open-Drain pins, T6 is measured at 1.5V, T9 at 0.8V.

Figure 12. Drive Levels and Timing Relationships for 80960MC Signals 
A.C. Specification Tables

80960MC A.C. Characteristics (16 MHz)

TCASE(3) = -55°C to +125°C, VCC = 5V ±5%

Symbol  Parameter                          Min     Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)      31.25   125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)    8             ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)   8             ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)           10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)           10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                 2       25    ns     CL = 100 pF (LAD)
                                                                CL = 75 pF (Controls)
T6H     HOLDA Output Valid Delay           4       31    ns     CL = 75 pF
T7      ALE Width                          15            ns     CL = 75 pF
T8      ALE Invalid Delay                  0       20    ns     CL = 75 pF (2)
T9      Output Float Delay                 2       20    ns     CL = 100 pF (LAD)
                                                                CL = 75 pF (Controls) (2)
T9H     HOLDA Output Float Delay           4       20    ns     CL = 75 pF
T10     Input Setup 1                      3             ns     (Note 1)
T11     Input Hold                         5             ns     (Note 1)
T11H    HOLD Input Hold                    4             ns
T12     Input Setup 2                      8             ns
T13     Setup to ALE Inactive              10            ns     CL = 100 pF (LAD)
                                                                CL = 75 pF (Controls)
T14     Hold after ALE Inactive            8             ns     CL = 100 pF (LAD)
                                                                CL = 75 pF (Controls)
T15     Reset Hold                         3             ns
T16     Reset Setup                        5             ns
T17     Reset Width                        1281          ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Case temperatures are "instant on".
A.C. Specification Tables (Continued)

80960MC A.C. Characteristics (20 MHz)

TCASE(3) = -55°C to +125°C, VCC = 5V ±5%

Symbol  Parameter                          Min     Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)      25      125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)    6             ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)   6             ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)           10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)           10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                 2       20    ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay           4       26    ns     CL = 50 pF
T7      ALE Width                          12            ns     CL = 50 pF
T8      ALE Invalid Delay                  0       20    ns     CL = 50 pF (2)
T9      Output Float Delay                 2       20    ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls) (2)
T9H     HOLDA Output Float Delay           4       20    ns     CL = 50 pF
T10     Input Setup 1                      3             ns     (Note 1)
T11     Input Hold                         5             ns     (Note 1)
T11H    HOLD Input Hold                    4             ns
T12     Input Setup 2                      7             ns
T13     Setup to ALE Inactive              10            ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T14     Hold after ALE Inactive            8             ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T15     Reset Hold                         3             ns
T16     Reset Setup                        5             ns
T17     Reset Width                        1025          ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Case temperatures are "instant on".
A.C. Specification Tables (Continued)

80960MC A.C. Characteristics (25 MHz)

TCASE(3) = -55°C to +125°C, VCC = 5V ±5%

Symbol  Parameter                          Min     Max   Units  Test Conditions

T1      Processor Clock Period (CLK2)      20      125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)    5             ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)   5             ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)           10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)           10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                 2       19    ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay           4       24    ns     CL = 50 pF
T7      ALE Width                          12            ns     CL = 50 pF
T8      ALE Invalid Delay                  0       20    ns     CL = 50 pF (2)
T9      Output Float Delay                 2       19    ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls) (2)
T9H     HOLDA Output Float Delay           4       20    ns     CL = 50 pF
T10     Input Setup 1                      3             ns     (Note 1)
T11     Input Hold                         5             ns     (Note 1)
T11H    HOLD Input Hold                    4             ns
T12     Input Setup 2                      7             ns
T13     Setup to ALE Inactive              8             ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T14     Hold after ALE Inactive            8             ns     CL = 60 pF (LAD)
                                                                CL = 50 pF (Controls)
T15     Reset Hold                         3             ns
T16     Reset Setup                        5             ns
T17     Reset Width                        820           ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Case temperatures are "instant on".
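
The Reset Width entries follow from the "41 CLK2 Periods Minimum" condition and the minimum CLK2 period (T1) of each speed grade. The short calculation below reproduces that relationship; the function name is illustrative, and the periods used are the T1 minimums listed in the three tables (31.25 ns, 25 ns and 20 ns).

```python
def min_reset_width_ns(clk2_period_ns, periods=41):
    """Minimum RESET assertion time: 41 CLK2 periods at the given period."""
    return periods * clk2_period_ns


# Minimum CLK2 period (T1) for each speed grade, in nanoseconds
for grade_mhz, t1_ns in ((16, 31.25), (20, 25), (25, 20)):
    print(f"{grade_mhz} MHz: {min_reset_width_ns(t1_ns)} ns")
```

A system that runs CLK2 slower than the minimum period needs a proportionally longer RESET pulse, since the requirement is stated in clock periods rather than absolute time.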
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Figure 15. RESET Signal Timing 
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Figure 16. Hold Timing 


Design Considerations

Input hold times can be disregarded by the designer
whenever the input is removed because a subsequent
output from the processor is deasserted (e.g.,
DEN becomes deasserted).

In other words, whenever the processor generates
an output that indicates a transition into a subsequent
state, the processor must have sampled any
inputs for the previous state.

Similarly, whenever the processor generates an output
that indicates a transition into a subsequent
state, any outputs that are specified to be three stated
in this new state are guaranteed to be three stated.

Designing for the ICE-960MC

The 80960MC In-Circuit Emulator assists in debugging
80960MC hardware and software designs. The
product consists of a probe module, cable, and control
unit. Because of the high operating frequency of
80960MC systems, the probe module connects directly
to the 80960MC socket.

When designing an 80960MC hardware system that
uses the ICE-960MC to debug the system, several
electrical and mechanical characteristics should be
considered. These considerations include capacitive
loading, drive requirement, power requirement, and
physical layout.

The ICE-960MC probe module increases the load
capacitance of each line by up to 25 pF. It also adds
one standard Schottky TTL load on the CLK2 line,
up to one advanced low-power Schottky TTL load
for each control signal line, and one advanced
low-power Schottky TTL load for each address/data and
byte enable line. These loads originate from the
probe module and are driven by the 80960MC
processor.

To achieve high noise immunity, the ICE-960MC
probe is powered by the user's system. The high-speed
probe circuitry draws up to 1.1A plus the maximum
current (ICC) of the 80960MC processor.

The mechanical considerations are shown in Figure
17, which illustrates the lateral clearance requirements
for the ICE-960MC probe as viewed from
above the socket of the 80960MC processor.
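
When sizing the target system's supply for use with the emulator, the probe current adds to the processor's own maximum supply current. The sketch below adds the 1.1A probe figure to the maximum ICC values from the D.C. characteristics (375, 420 and 480 mA); it is a simple budget estimate with illustrative names, not a statement of guaranteed emulator behavior.

```python
PROBE_CURRENT_MA = 1100   # probe circuitry draws up to 1.1 A

# Maximum processor supply current (ICC) by clock rate, in mA
ICC_MAX_MA = {16: 375, 20: 420, 25: 480}


def emulation_supply_ma(icc_max_ma, probe_ma=PROBE_CURRENT_MA):
    """Worst-case current the user's system must supply at the socket."""
    return probe_ma + icc_max_ma


for mhz, icc in ICC_MAX_MA.items():
    print(f"{mhz} MHz: {emulation_supply_ma(icc)} mA")
```

The same kind of budget applies to loading: each line sees up to 25 pF of added capacitance, which should be included when checking against the load conditions in the A.C. tables.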
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Figure 17. ICE-960MC Lateral Clearance Requirements 


MECHANICAL DATA 


Pin Assignment 

The 80960MC is packaged in a 132-lead ceramic pin
grid array and a 164-lead ceramic quad flatpack. The
80960MC pin grid array pinout as viewed from the
substrate side of the component is shown in Figure
18 and from the pin side in Figure 19. The 80960MC
ceramic quad flatpack pinout as viewed from the top
of the package is shown in Figure 20.

VCC and GND connections must be made to multiple
VCC and GND pins. Each VCC and GND pin must
be connected to the appropriate voltage or ground
and externally strapped close to the package.
Preferably, the circuit board should include power and
ground planes for power distribution. Tables 5, 6, 7
and 8 list the function of each pin.

NOTE: 

Pins identified as N.C., “No Connect,” should never 
be connected under any circumstances. 


Package Dimensions and Mounting 

Pins in the pin grid array package are arranged 
0.100 inch (2.54mm) center-to-center, in a 14 by 14 
matrix, three rows around. (See Figure 21.)
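
The pin count follows from the grid geometry: a 14 by 14 matrix with only the outer three rows populated leaves an empty 8 by 8 center. The short sketch below works through that arithmetic; the function names are illustrative and not part of any Intel tool.

```python
def pga_pin_count(matrix: int, rows_around: int) -> int:
    """Pins in a square grid whose outer `rows_around` rings are populated."""
    inner = matrix - 2 * rows_around  # side of the unpopulated center
    if inner < 0:
        raise ValueError("rings overlap: too many rows for this matrix")
    return matrix * matrix - inner * inner


def outer_pin_span_inches(matrix: int, pitch_in: float = 0.100) -> float:
    """Center-to-center distance between the first and last pin of a row."""
    return (matrix - 1) * pitch_in


pins = pga_pin_count(14, 3)  # 196 - 64 = 132
```

The result matches the 132-lead package; the outermost pin centers are 13 pitches (about 1.3 inches) apart.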

A wide variety of available sockets allow low-inser- 
tion or zero-insertion force mountings, and a choice 
of terminals such as soldertail, surface mount, or 
wire wrap. Several applicable sockets are shown in 
Figure 22. 


Package Thermal Specification 

The 80960MC is specified for operation when its
case temperature is within the range of -55°C to
+125°C. The PGA case temperature should be
measured at the center of the top surface opposite 
the pins as shown in Figure 23. The ceramic quad 
flatpack case temperature should be measured at 
the center of the lid on the top surface of the pack- 
age. 


WAVEFORMS 

Figures 24 through 30 show the waveforms for vari- 
ous transactions on the 80960MC’s local bus. 

Figure 18. MG80960MC Pinout — View from Top (Pins Facing Down)

Figure 19. MG80960MC Pinout — View from Bottom (Pins Facing Up)

Table 5. MG80960MC (PGA) Pinout — In Pin Order



NOTE: 

Pins identified as N.C. (“No Connect”) should never be connected under any circumstances. 

Table 6. MG80960MC (PGA) Pinout — In Signal Order



NOTE: 

Pins identified as N.C. (“No Connect”) should never be connected under any circumstances. 

Table 7. MQ80960MC (CQP) Pinout — In Pin Order

Pin  Signal     Pin  Signal     Pin  Signal     Pin  Signal
  1  BE0         42  LAD11       83  N.C.       124  N.C.
  2  BE3         43  LAD12       84  Vcc        125  Vss
  3  READY       44  LAD9        85  N.C.       126  Vcc
  4  BE1         45  LAD10       86  N.C.       127  N.C.
  5  CACHE       46  LAD7        87  Vss        128  N.C.
  6  DT/R        47  LAD8        88  N.C.       129  N.C.
  7  LAD31       48  LAD5        89  N.C.       130  N.C.
  8  W/R         49  LAD6        90  N.C.       131  N.C.
  9  LAD29       50  LAD4        91  N.C.       132  N.C.
 10  LAD30       51  LAD1        92  N.C.       133  N.C.
 11  LAD27       52  CLK2        93  N.C.       134  N.C.
 12  LAD28       53  INT2        94  N.C.       135  N.C.
 13  ALE         54  LAD3        95  N.C.       136  N.C.
 14  LAD26       55  LAD2        96  N.C.       137  N.C.
 15  ADS         56  LAD0        97  N.C.       138  N.C.
 16  HLDA        57  RESET       98  N.C.       139  N.C.
 17  N.C.        58  INT3        99  N.C.       140  N.C.
 18  Vss         59  INT1       100  Vcc        141  N.C.
 19  Vcc         60  Vss        101  N.C.       142  N.C.
 20  Vss         61  Vcc        102  N.C.       143  N.C.
 21  Vcc         62  Vss        103  Vss        144  N.C.
 22  Vcc         63  Vcc        104  N.C.       145  N.C.
 23  Vss         64  Vss        105  N.C.       146  N.C.
 24  Vcc         65  Vcc        106  N.C.       147  N.C.
 25  Vss         66  Vss        107  N.C.       148  N.C.
 26  Vcc         67  Vcc        108  N.C.       149  N.C.
 27  HOLD        68  N.C.       109  N.C.       150  N.C.
 28  BADAC       69  N.C.       110  N.C.       151  N.C.
 29  LAD25       70  N.C.       111  N.C.       152  N.C.
 30  LAD24       71  N.C.       112  N.C.       153  Vss
 31  LAD23       72  N.C.       113  N.C.       154  Vcc
 32  LAD21       73  N.C.       114  N.C.       155  N.C.
 33  LAD22       74  N.C.       115  N.C.       156  N.C.
 34  LAD19       75  IAC/INT0   116  N.C.       157  N.C.
 35  LAD20       76  N.C.       117  N.C.       158  Vss
 36  LAD17       77  N.C.       118  N.C.       159  N.C.
 37  LAD18       78  N.C.       119  Vss        160  LOCK
 38  LAD16       79  N.C.       120  Vcc        161  FAILURE
 39  LAD15       80  N.C.       121  N.C.       162  DEN
 40  LAD14       81  N.C.       122  N.C.       163  BE2
 41  LAD13       82  N.C.       123  N.C.       164  Vss

NOTE: 

Pins identified as N.C. (“No Connect”) should never be connected under any circumstances. 

Table 8. MQ80960MC (CQP) Pinout — In Signal Order

Signal      Pin   Signal      Pin   Signal  Pin   Signal  Pin
ADS          15   LAD23        31   N.C.    102   N.C.    148
ALE          13   LAD24        30   N.C.    104   N.C.    149
BADAC        28   LAD25        29   N.C.    105   N.C.    150
BE0           1   LAD26        14   N.C.    106   N.C.    151
BE1           4   LAD27        11   N.C.    107   N.C.    152
BE2         163   LAD28        12   N.C.    108   N.C.    155
BE3           2   LAD29         9   N.C.    109   N.C.    156
CACHE         5   LAD30        10   N.C.    110   N.C.    157
CLK2         52   LAD31         7   N.C.    111   N.C.    159
DEN         162   LOCK        160   N.C.    112   READY     3
DT/R          6   N.C.         17   N.C.    113   RESET    57
FAILURE     161   N.C.         68   N.C.    114   Vcc      19
HLDA/HOLDR   16   N.C.         69   N.C.    115   Vcc      21
HOLD/HLDAR   27   N.C.         70   N.C.    116   Vcc      22
IAC/INT0     75   N.C.         71   N.C.    117   Vcc      24
INT1         59   N.C.         72   N.C.    118   Vcc      26
INT2/INTR    53   N.C.         73   N.C.    121   Vcc      61
INT3/INTA    58   N.C.         74   N.C.    122   Vcc      63
LAD0         56   N.C.         76   N.C.    123   Vcc      65
LAD1         51   N.C.         77   N.C.    124   Vcc      67
LAD2         55   N.C.         78   N.C.    127   Vcc      84
LAD3         54   N.C.         79   N.C.    128   Vcc     100
LAD4         50   N.C.         80   N.C.    129   Vcc     120
LAD5         48   N.C.         81   N.C.    130   Vcc     126
LAD6         49   N.C.         82   N.C.    131   Vcc     154
LAD7         46   N.C.         83   N.C.    132   Vss      18
LAD8         47   N.C.         85   N.C.    133   Vss      20
LAD9         44   N.C.         86   N.C.    134   Vss      23
LAD10        45   N.C.         88   N.C.    135   Vss      25
LAD11        42   N.C.         89   N.C.    136   Vss      60
LAD12        43   N.C.         90   N.C.    137   Vss      62
LAD13        41   N.C.         91   N.C.    138   Vss      64
LAD14        40   N.C.         92   N.C.    139   Vss      66
LAD15        39   N.C.         93   N.C.    140   Vss      87
LAD16        38   N.C.         94   N.C.    141   Vss     103
LAD17        36   N.C.         95   N.C.    142   Vss     119
LAD18        37   N.C.         96   N.C.    143   Vss     125
LAD19        34   N.C.         97   N.C.    144   Vss     153
LAD20        35   N.C.         98   N.C.    145   Vss     158
LAD21        32   N.C.         99   N.C.    146   Vss     164
LAD22        33   N.C.        101   N.C.    147   W/R       8




NOTE: 

Pins identified as N.C. (“No Connect”) should never be connected under any circumstances. 
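
The pin-order and signal-order tables are two views of the same assignment, so either can be derived from the other and checked for duplicates. The sketch below does this for a handful of signal pins copied from Table 7; the helper name is illustrative, and power and N.C. pins are left out because those names repeat.

```python
# A few signal pins from the pin-order table (pin number -> signal name).
PIN_TO_SIGNAL = {
    3: "READY",
    5: "CACHE",
    15: "ADS",
    52: "CLK2",
    57: "RESET",
    160: "LOCK",
    162: "DEN",
}


def signal_order(pin_to_signal):
    """Return (signal, pin) pairs sorted by signal name, rejecting duplicates."""
    signal_to_pin = {}
    for pin, signal in pin_to_signal.items():
        if signal in signal_to_pin:
            raise ValueError(f"signal {signal} assigned to more than one pin")
        signal_to_pin[signal] = pin
    return sorted(signal_to_pin.items())


listing = signal_order(PIN_TO_SIGNAL)
```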

Figure 21. A 132-Lead Pin-Grid Array (PGA) Used to Package the MG80960MC

• Low insertion force (LIF) soldertail
55274-1

• Amp tests indicate 50% reduction in
insertion force compared to
machined sockets

Other socket options

• Zero insertion force (ZIF) soldertail
55583-1

• Zero insertion force (ZIF) burn-in
version 55573-2

Amp Incorporated
(Harrisburg, PA 17105 U.S.A.
Phone 717-564-0100)

Cam handle locks in low profile position when MG80960MC is installed
(handle UP for open and DOWN for closed positions).

Courtesy Amp Incorporated


Peel-A-Way* Mylar and Kapton
Socket Terminal Carriers

• Low insertion force surface
mount CS132-37TG

• Low insertion force soldertail
CS132-01TG

• Low insertion force wire-wrap
CS132-02TG (two-level)
CS132-03TG (three-level)

• Low insertion force press-fit
CS132-05TG

Advanced Interconnections
(5 Division Street
Warwick, RI 02818 U.S.A.
Phone 401-885-0485)

Peel-A-Way Carrier No. 132:
Kapton Carrier is KS132
Mylar Carrier is MS132

Molded Plastic Body KS132
is shown below.

[Drawing: Footprint No. 132; 1.400 SQ; .100 TYP; 14 x 14 x 3 ROWS]

Courtesy Advanced Interconnections
(Peel-A-Way Terminal Carriers
U.S. Patent No. 4442938)

*Peel-A-Way is a trademark of Advanced Interconnections.


Figure 22. Several Socket Options for Mounting the MG80960MC

Clock Relationship


Figure 23. Measuring MG80960MC PGA
Case Temperature (Tc)

Figure 30. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master)


Revision History 

1. 20 MHz timing specifications were added. 

2. Pin 158, ceramic quad pack (see Figure 20), changed from NC (No Connect) to Vss.

FAULT TOLERANT BUS EXTENSION UNIT

Military


■ Multiprocessor Support

— Connect up to 32 Processor and 
Memory Modules in a Single System 

■ Multiple Bus Support with No External 
Logic 

— Connect up to Four 32-Bit Buses for 
High-Bandwidth Access to 
Interleaved Memory 

■ Software-Transparent Fault Tolerance

— Recover from a Single-Point Failure
in a Module or Bus without Affecting
Program Execution

■ Cache Control Support 

— Provides Directory, Coherency 
Logic, and Control Signals for a 
Two-Way Set-Associative Cache 
— Single BXU Supports 16 Kbytes 
— Combine up to Four BXUs to 
Support 64 Kbytes 


■ Message Passing 

— Supports Interagent Communication 
— Redundant Error Reporting Network 

■ Two I/O Prefetch Channels 

— Provides High-Bandwidth, Low 
Latency Access to Memory or I/O 
for Sequential Transfers 

■ Memory Module Support 

— Interfaces Discrete Memory 
Controller and DRAM Array to AP- 
Bus 

■ Advanced CHMOS III Technology 

■ Advanced Package Technology 

— 132 Lead Ceramic Pin Grid Array 
— 164 Lead Ceramic Quad Flatpack 

■ Military Temperature Range: 

-55°C to +125°C (Tc)


The M82965 Bus Extension Unit (BXU) is the key to building multiprocessor and fault-tolerant systems with the 
80960MC 32-bit microprocessor. BXUs connect to each other in an expandable matrix that can support up to 
32 processor and memory modules in a single, high-performance system. No external interface logic is re- 
quired. The BXU increases overall system performance by providing hardware support for local caches, I/O 
prefetch, message passing, and multiprocessor arbitration. Through redundant modules, fault-tolerant systems 
based on the BXU can sustain a single-point failure and then reconfigure themselves automatically, while 
application programs continue undisrupted. Truly a VLSI building block, the M82965 BXU supports a wide
range of fault tolerance and performance options to meet a diverse set of cost, performance, and reliability 
needs. 



[Block diagram. Labeled blocks: Traffic Control Logic, Traffic Control Bus, Internal Addr/Data Bus, Local Register Set, IAC Support Logic, Cache Control Logic, I/O Prefetch Logic, Fault Tolerance Logic, AP-Bus Interface]


Figure 1. M82965 Block Diagram

FUNCTIONAL OVERVIEW

The M82965 Bus Extension Unit (BXU) is the key 
component in building multiprocessor and fault-toler- 
ant system designs with the 80960MC 32-bit micro- 
processor. Its primary function is to connect the Lo- 
cal bus (L-Bus) of a system module to a system-wide 
bus called the Advanced Processor Bus (AP-Bus), 
allowing the system to expand incrementally as 
each new module or AP-Bus is added. 

Several important features are provided within the 
BXU which streamline 80960MC multiprocessor sys- 
tem operation. To increase the available system bus 
bandwidth, multiple BXUs can be employed within 
each system module to support up to four AP-Buses. 
To reduce AP-Bus traffic, BXU components can di- 
rectly support a two-way set-associative cache. I/O 
prefetch channels are incorporated within each BXU 
to reduce the time necessary to transfer large blocks 
of data from shared system memory or I/O. BXUs 
support processor-to-processor communication by 
recognizing, storing, and exchanging Interagent 
Communication (IAC) messages with other BXUs
along the AP-Bus. Requests for access to the AP- 
Bus are resolved through BXU arbitration logic 
which ensures that no system modules will suffer 
from resource starvation. 

BXUs support fault tolerant system operation 
through several mechanisms used to detect, isolate 
and recover from hardware errors. Paired BXUs 
monitor each other’s operation on a cycle-by-cycle
basis through a method called Functional Redundancy
Checking (FRC). Errors on the AP-Bus are
detected through interlaced parity bits on the ad- 
dress/data and control lines, signal duplication on 
the transaction control lines, and a bus timer used to 
monitor the bus for non-response to a request. Re- 
covery mechanisms include the capability to marry 
FRC modules in a primary-shadow pair (Quad Modu- 
lar Redundancy), so that if either fails, the surviving 
spouse can take over operations immediately. Tran- 
sient errors on the AP-Bus are automatically retried, 
and in the case of permanent errors, the failed bus is 
disabled and all memory accesses switched to a 
backup bus. 


MULTIPROCESSOR SUPPORT 


A multiprocessor 80960MC system is composed of 
a set of modules connected to an AP-Bus. Figure 2 
shows the three possible types of modules: active, 
passive, and the combination of both an active and 
passive module. Active modules contain up to two 
80960MC processors, cache or private memory, and 
a BXU. Passive modules contain a memory array 
and controller and a BXU. Active/Passive modules
contain either processors and global memory, or 
master and slave I/O devices. 



[Figure: an active module, a passive module, and two active/passive modules, each attached to the AP-Bus]


Figure 2. Types of Modules

Local Bus

In a multiprocessor system each module has its own 
Local Bus (L-Bus), which is typically confined to a 
single board. The L-Bus is provided to interconnect 
components within a module. It is a 32-bit multi- 
plexed, synchronous bus with a maximum bandwidth 
of 43 Mbytes per second at 16 MHz. It has been 
designed to interface with standard support compo- 
nents using minimal glue logic. The L-Bus uses 
HOLD/HOLDA for arbitration with bus slaves and
LOCK for signaling indivisible operations. A READY 
signal can be used to lengthen bus transactions. 


Local Bus protocol permits both primary and sec- 
ondary bus masters to coexist on the bus (often a 
processor and a DMA, or occasionally two proces- 
sors). A secondary bus master must obtain use of 
the L-Bus from the bus master through the use of 
HOLDR/HOLDAR. A BXU is always used as a mas- 
ter in a memory module and is generally used as a 
slave in a processor module. Fifty BXU pins are ded- 
icated to L-Bus and module support operations (in- 
cluding cache control). The L-Bus control registers 
are shown in Table 1. 


Table 1. L-Bus Control Registers 


Register 

Description 

Physical-ID (Local) 

This register contains a unique identifier for a specific BXU on the L-Bus. It 
corresponds to the AP-Bus Physical-ID register. 

Logical-ID (Local) 

This register holds the Logical-ID of the BXU. It corresponds to the AP-Bus 
Logical-ID register. 

LBI Control 

This is the major control register for BXU functions on the L-Bus. It sets the
interleaving factor for the cache, determines whether the BXU acts as
a master on the L-Bus, and indicates whether the BXU is in memory or
processor mode.

System Bus ID 

This register uniquely identifies the BXU as attached to one of four AP-Buses. 

Local-Bus Test 

This register allows system diagnostics to check on the type of recognition 
that was done on the previous L-Bus request. 

Match 0 

The contents of this register determine which bits in the L-Bus address should 
be recognized by the BXU. This register provides a base address for a 
partition of memory recognized by the BXU. 

Mask 0

The contents of this register determine if certain bits in the Match 0 register
should be ignored (i.e., marked “don’t care”) during address recognition.

Match 1 

Same function as Match Register 0. 

Mask 1 

Same function as Mask Register 0. 

Match 2 

Same function as Match Register 0. 

Mask 2 

Same function as Mask Register 0. 

Private Memory Match 

Private memory address recognizer. 

Private Memory Mask 

Private memory mask register. 

Advanced Processor Bus

A highly optimized multiprocessing bus called the
Advanced Processor Bus (AP-Bus) interconnects
80960MC system modules. The AP-Bus is synchro- 
nous, in that all components in the system, including 
processors and BXUs, are driven by the same clock 
edge. It is a 32-bit multiplexed bus with a maximum 
bandwidth of 43 Mbytes per second at 16 MHz. 

Transactions over the AP-Bus are encoded into 
pairs of request and reply packets. A request packet 
defines the operation, amount of data, and the loca- 
tion (or address) where the transaction will occur. In 
the case of a write request, the packet will also in- 
clude data. The reply packet indicates whether or 
not the action completed successfully, and in the 
case of read replies, will also include the requested 
data. Table 2 lists the various types of AP-Bus oper- 
ations. 

The AP-Bus supports a pipelining feature that allows 
up to three requests to be pending at any time. Re- 
ply packets are returned in the order requested un- 
less deferred, but requests and replies may be inter- 
mixed. For example, two requests may be made, fol- 
lowed by a single reply packet, then another request 
packet, before being completed by two reply pack- 
ets. 
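
The pipelining rule can be pictured as a first-in, first-out queue that holds at most three outstanding requests. The model below is a simplified sketch: the class and method names are invented for illustration, and deferred replies (the "unless deferred" case) are not modeled.

```python
from collections import deque

MAX_PENDING = 3  # the AP-Bus allows up to three requests to be pending


class ApBusPipeline:
    """Toy model: requests queue up and replies return in request order."""

    def __init__(self):
        self._pending = deque()

    @property
    def pending(self):
        return len(self._pending)

    def request(self, tag):
        if len(self._pending) >= MAX_PENDING:
            raise RuntimeError("three requests are already pending")
        self._pending.append(tag)

    def reply(self):
        if not self._pending:
            raise RuntimeError("no request is pending")
        return self._pending.popleft()  # oldest request is answered first


# The sequence from the text: two requests, one reply, one request, two replies.
bus = ApBusPipeline()
bus.request("A")
bus.request("B")
first = bus.reply()
bus.request("C")
rest = [bus.reply(), bus.reply()]
```

Here `first` is the reply to request A, and `rest` holds the replies to B and C in that order.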

The AP-Bus consists of 47 bi-directional signals, a 
clock signal, a RESET signal, and five module sup- 
port signals which are used to interface system mod- 
ules to the AP-Bus (see Figure 3). The BXU is the 
only component that attaches to the AP-Bus. 


BXUs connect to each other in the form of a matrix 
to allow orderly growth in the system by the addition 
of buses or modules. An 80960MC multiprocessing 
system allows up to 32 modules and four AP-Buses. 
In practice, the number of modules in a system will 
be somewhat less in order to meet the AP-Bus’s
timing and electrical specifications; a practical limit 
may be 20 to 25 connections to an AP-Bus. Table 3 
contains a summary of the functions of the AP-Bus 
Interface Registers. 


Table 2. Types of AP-Bus Operations

Packet Type   Base Action   Specific Operation
Request       Write         Write Word(s)
                            RMW Write Word(s)
              Read          Read Word(s)
                            RMW Read Word(s)
Reply         Accepted      Read Reply Word(s)
                            Acknowledge (Write Reply)
              Refused       Reissue
                            Not Acknowledged (NACK)
                            Bad Access

Transaction Control (5 lines)

• Arbitration: ARB (3..0)

• Reply Ordering: RPYDEF

Packet Signals (38 lines)

• Specification: SPEC (5..0)

• Address/Data: AD (31..0)

Error Signal Group (4 lines)

• Check Signal: CHK (1..0)

• Bus Error: (1..0)

Synchronization and Initialization Group (2 lines)

• System Clock: CLK2

• Initialization: RESET

Module Support Group (7 lines)

• Identification: INITID

• Module Check: MODCHK

• Bus Output Control: BOUT

• Communication: COM

• Voltage Reference: Vref

• Pop Queue: POPQUE

• Subsystem Busy: SSBUSY


Figure 3. Advanced Processor Bus

Table 3. AP-Bus Interface Registers


Register

Description

Physical ID

This register contains a unique identifier for a specific BXU (or FRC pair of
BXUs) on an AP-Bus.

Logical ID

This register holds the logical ID for the BXU. In every case, all BXUs in the
same module will share the same logical ID. When two modules are married
in a QMR configuration, they will also share the same logical ID.

Component Specifier

The contents of this read-only register are fixed at manufacture and specify
the type and stepping of the component.

Arbitration ID

When the BXU needs to issue a request on the AP-Bus, it must actively
arbitrate for the bus. The time and order in which a BXU arbitrates is
determined by the contents of this write-only register.

Com

This register is used for loading external information, such as the type of
board the BXU resides on, into the BXU. The register is useful for both
initialization and diagnostics.

AP-Bus Control

This register is the general control and status register for the BXU’s AP-Bus
interface.

FT1

Most of the BXU fault-tolerant capabilities can be selectively enabled by
altering control bits in this register.

Maxtime

The value in this register determines the length of time that BXUs will remain
quiescent following the beginning of an error report.

FRC Splitting Control

Writing to this register allows a master/checker pair of BXUs to be split into
separately functioning components.

FRC Register

The contents of this register determine if a BXU is part of a master/checker
pair and how the component responds if it is part of a QMR module.

Test Detection

Bits in this register enable parity logic and other internal self-testing diagnostic
features.

AP Match

Bits in this register are compared against the corresponding bits in the AP-Bus
address cycle and determine which partition of the address space is
recognized by this BXU.

AP Mask

If a bit in this register is cleared, it will cause the corresponding bit position in
the Address Match register to be ignored during comparisons.
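
The AP Match and AP Mask descriptions amount to a masked address comparison: only bit positions whose mask bit is set take part. The sketch below illustrates that rule. The function name and the example register values are hypothetical, and the real registers may cover only part of the 32-bit address.

```python
def ap_address_recognized(address: int, match: int, mask: int) -> bool:
    """True when address and match agree in every bit position set in mask.

    A cleared mask bit means that position is ignored, as described for AP Mask.
    """
    return ((address ^ match) & mask & 0xFFFFFFFF) == 0


# Hypothetical partition: recognize any address whose top four bits are 0100.
MATCH = 0x40000000
MASK = 0xF0000000

inside = ap_address_recognized(0x41234560, MATCH, MASK)
outside = ap_address_recognized(0x50000000, MATCH, MASK)
```

With these example values, `inside` is true and `outside` is false.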


Memory addressing over the AP-Bus is divided into 
16-byte blocks. The location of a bus transaction is
defined by a 32-bit address. Each address points to 
a single byte that is part of a larger 16-byte block. All 
transactions are performed on a single block or por- 
tion of a block, and do not overlap multiple blocks. 
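
Block addressing can be expressed with simple bit masking. The helper names below are illustrative; the sketch only shows how a byte address maps to its 16-byte block and how to test that a transfer stays inside one block.

```python
BLOCK_SIZE = 16  # bytes per AP-Bus block


def block_base(address: int) -> int:
    """Address of the first byte of the 16-byte block containing `address`."""
    return address & ~(BLOCK_SIZE - 1) & 0xFFFFFFFF


def fits_in_one_block(address: int, length: int) -> bool:
    """True when `length` bytes starting at `address` stay within one block."""
    if length <= 0:
        return False
    return block_base(address) == block_base(address + length - 1)


base = block_base(0x1234)                 # 0x1230
aligned = fits_in_one_block(0x123C, 4)    # bytes 0x123C..0x123F
crossing = fits_in_one_block(0x123E, 4)   # bytes 0x123E..0x1241
```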


Modes of Operation 

The BXU operates in either Processor or Memory
mode. Processor mode provides support for Active 
or Active/Passive modules, while Memory mode 
supports Passive modules. The functions of several 
BXU signals are dependent on the operating mode 
of the BXU. 


In Processor mode, the BXU supports cache, I/O 
prefetch and lAC message functions. The BXU can 
act as either a master or slave on the L-Bus and 
requests can flow in either direction between the 
AP-Bus and the L-Bus. The assumption is, however,
that most traffic will flow from the L-Bus out onto the
AP-Bus. In a processor-only module, there is no
need for the BXU to participate in arbitration for the
L-Bus, since it will operate only as a slave. 

In Memory mode, the BXU always operates as a 
master on the L-Bus and no requests are ever ac- 
cepted from the L-Bus. All requests flow from the 
AP-Bus into the module. In this mode, the BXU sup- 
ports memory functions and signaling, but does not 
provide caching or I/O prefetch. 

Read-Modify-Write Transactions

Read-Modify-Write (RMW) operations are provided
to give BXUs the ability to read and modify a location
as a single indivisible action. An RMW-Read operation
initiates the indivisible action by asserting the
LOCK signal on the L-Bus. An RMW-Write operation
is used to terminate the action.

When an RMW-Read transaction occurs, the block
of memory addressed is marked by the BXU controlling
that portion of memory as locked (the lock covers
a fixed address space based on address bits 4
and 6). Once locked, any other RMW-Reads to this
block will be rejected, but the block remains available
for other types of memory operations.

When an RMW-Read is issued, the BXU controlling
the affected memory will either respond with data in
a normal Read Reply (and set the appropriate lock),
or it will respond with a Reissue Reply indicating that
the requested block is already locked. If refused, the
requesting BXU will wait a short interval and then put
the RMW-Read request back into the arbitration process
and try again.

An RMW-Write is equivalent to a Write Word(s) operation
except that it resets the lock for that memory location. The
only valid reply packet is the Ack (Write Reply).
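
The lock behavior can be summarized with a small state model. This is a sketch under stated assumptions: the class and method names are invented, and locks are keyed by 16-byte block for simplicity, whereas the actual lock granularity is set by the address bits noted above.

```python
class RmwLockTable:
    """Toy model of the memory-side BXU's RMW lock handling."""

    def __init__(self):
        self._locked = set()

    @staticmethod
    def _key(address: int) -> int:
        return address >> 4  # assumption: one lock per 16-byte block

    def rmw_read(self, address: int) -> str:
        key = self._key(address)
        if key in self._locked:
            return "Reissue"       # already locked: requester must retry later
        self._locked.add(key)
        return "Read Reply"        # data returned and lock set

    def rmw_write(self, address: int) -> str:
        self._locked.discard(self._key(address))  # write resets the lock
        return "Acknowledge"


table = RmwLockTable()
replies = [
    table.rmw_read(0x2000),   # first agent takes the lock
    table.rmw_read(0x2004),   # second agent, same block: refused
    table.rmw_write(0x2000),  # first agent completes and unlocks
    table.rmw_read(0x2004),   # retry now succeeds
]
```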


Interagent Communications (IAC)
Support

Bus Extension Units and 80960MC processors communicate
by sending Interagent Communication
(IAC) messages, which use a set of memory-mapped
addresses recognized by all BXUs. These messages
are used for such system functions as initialization,
cache flushing, access to error logs and interrupts.
The upper 16 Mbytes of the 80960MC’s 4 Gigabyte
address range are reserved for IAC communications.
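
The reserved region follows directly from the sizes given: the top 16 Mbytes of a 4 Gigabyte space begin at 2^32 - 2^24. The constant and function names below are illustrative only.

```python
ADDRESS_SPACE = 1 << 32        # 4 Gigabyte address range
IAC_REGION_SIZE = 16 << 20     # upper 16 Mbytes reserved for IAC use
IAC_BASE = ADDRESS_SPACE - IAC_REGION_SIZE  # 0xFF000000


def is_iac_address(address: int) -> bool:
    """True when a 32-bit address falls in the reserved IAC region."""
    return IAC_BASE <= address < ADDRESS_SPACE
```

For example, `is_iac_address(0xFF000010)` is true, while `is_iac_address(0xFEFFFFFF)` is false.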


IAC requests fall into two major groups: messages
and register requests. Messages are sent between
processors to cause a processor to perform a specific
action (e.g., start, stop, flush cache, etc.) and
are held in the IAC message support registers; Table
4 summarizes the function of these four registers.
Register requests are used by software to read and
write to BXU registers in order to control the system
operation or configuration.

An IAC message always originates on an L-Bus and
usually from a processor. From the originator, the
request flows to the BXU where it may be handled
internally or propagated on to the AP-Bus. If the IAC
is sent on to the AP-Bus, the final destination of the
IAC (another BXU) must reside on that bus. The IAC
will not be propagated onto another L-Bus or AP-Bus.
IAC messages can be one to four words long.

Although each L-Bus (processor or memory module) 
may be connected to as many as four AP-Buses, at 
any point in time only one bus will be designated as 
the message bus. All lAC messages will flow over 
that bus. The BXUs on the message bus are respon- 
sible for handling the lAC message traffic on behalf 
of the processors residing on their L-Bus (an L-Bus 
may support one or two processors). 

AP-Bus 0 normally serves as the message bus. If 
AP-Bus 0 is not functional, then AP-Bus 1 serves as 
the message bus, completely transparent to the 
software. Processors are unaware of which bus is 
actually acting as the message bus. 


I/O Prefetch Support 

The BXU offers two I/O prefetch channels to provide
high bandwidth, low latency access to memory
for sequential transfers. Each channel buffers 32
bytes of data in two 16-byte blocks. As data is requested
from the buffers, the BXU automatically prefetches
the next data block. The BXU can take
advantage of the three-deep AP-Bus pipeline to
quickly fill the buffers if it ever gets behind because
of momentary surges in AP-Bus traffic. In this way,
the prefetch logic acts to provide stable, bounded
response times, even in large multiprocessor configurations.


Table 4. IAC Support Registers


Register

Description

Processor 0 Priority

This register holds the priority of the task (process) which Processor 0 on the
BXU’s L-Bus is currently executing.

Processor 0 Message

This register buffers four words of data from an IAC message for Processor 0.

Processor 1 Priority

This register holds the priority of the task (process) which Processor 1 on the
BXU’s L-Bus is currently executing.

Processor 1 Message

This register buffers four words of data from an IAC message for Processor 1.

Because the normal operation of the BXU hides the 
latency of write requests by replying immediately on 
the L-Bus, the prefetch unit operates only for read 
requests. On a read request from the L-Bus, the pre- 
fetch logic returns the amount of data requested. 
Any processor or intelligent device used with the 
BXU must guarantee that it will split all memory re- 
quests that cross 16-byte boundaries into two re- 
quests. 
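
The splitting rule can be sketched as follows. This is an illustration, not a description of any particular bus master: the function name is invented, and requests are assumed to be at most 16 bytes long, so a request can cross at most one boundary.

```python
BLOCK_SIZE = 16  # bytes per block


def split_at_block_boundary(address: int, length: int):
    """Split a request of up to 16 bytes so no part crosses a 16-byte boundary.

    Returns a list of (address, length) pairs: one pair if the request fits in
    a single block, two pairs if it straddles a boundary.
    """
    if not 0 < length <= BLOCK_SIZE:
        raise ValueError("length must be between 1 and 16 bytes")
    boundary = (address | (BLOCK_SIZE - 1)) + 1  # first byte of the next block
    first_len = min(length, boundary - address)
    parts = [(address, first_len)]
    if first_len < length:
        parts.append((boundary, length - first_len))
    return parts


straddling = split_at_block_boundary(0x100C, 8)   # crosses 0x1010
contained = split_at_block_boundary(0x1000, 16)   # exactly one block
```

Here `straddling` becomes two 4-byte requests at 0x100C and 0x1010, while `contained` stays a single request.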


Cache Support 

The main function of a cache is to provide local high
speed storage for frequently accessed memory locations.
Storing the information locally, the cache
intercepts memory references and handles them directly
without transferring the request to the AP-Bus.
This action results in lower traffic on the AP-Bus and
decreased latency on the L-Bus, leading to improved
performance for a processor on the L-Bus. It
also increases potential system performance in a 
multiprocessor system by reducing each processor’s 
demand for AP-Bus bandwidth, thereby allowing 
more processors in a system. 

The BXU provides cache directory, coherency logic, 
and control signals, while external SRAM is used for 
data storage. A CACHE signal output from the 
80960MC processor indicates to the BXU whether a 
request is cacheable. The operation of the BXU 
cache is not dependent on the size of the data trans- 
fer and therefore can support partial writes. Both 
data and instructions can be contained within the 
local cache. 


The BXU supports a two-way, set associative cache 
with 64 sets. The (read address) tag field is 20 bits 
long and consists of LAD lines 31-12. There are 
eight bits that indicate if a line is valid (a line is 16 
bytes). The control bits in the cache control registers
can be used to mask some of these bits to change
cache configurations. All entries in the directory can
be invalidated by sending an INVALIDATE CACHE
command to each BXU in the module. Figure 4
shows one example of a BXU cache directory and 
its relation to L-Bus addresses. 



[Figure: cache directory array. Labels: AP-Bus address fields (including LAD12-LAD7 and LAD6-LAD4); Way 0 and Way 1 tag arrays of stored addresses; Set 0 through Set 63; line select; word select; encoder; way bit; cache address]


Figure 4. Example of a Cache Directory Array

A single BXU supports 16 Kbytes of cache. When a
processor module uses multiple BXUs (and there- 
fore multiple buses), the BXUs cooperate to provide 
a larger directory and addressing for a larger cache. 
The best way to view this larger directory is to think 
of it as having an increased number of sets. Thus a 
cache managed by two BXUs will have a directory 
consisting of 128 sets instead of 64. The maximum 
size cache is 64 Kbytes (four BXUs supporting four 
AP-Buses per processor module). 
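
The cache sizes quoted above are consistent with the directory organization. The sketch below works through the arithmetic under one interpretation: each directory entry is assumed to cover eight 16-byte lines, one per valid bit. The names are illustrative.

```python
WAYS = 2              # two-way set-associative
SETS_PER_BXU = 64     # sets contributed by each BXU's directory
LINES_PER_ENTRY = 8   # assumption: one valid bit per 16-byte line in an entry
LINE_BYTES = 16


def cache_bytes(bxu_count: int) -> int:
    """Cache capacity in bytes for a module with 1 to 4 cooperating BXUs."""
    if not 1 <= bxu_count <= 4:
        raise ValueError("a module uses between one and four BXUs")
    sets = SETS_PER_BXU * bxu_count  # more BXUs means more sets
    return WAYS * sets * LINES_PER_ENTRY * LINE_BYTES


single = cache_bytes(1)    # 16384 bytes = 16 Kbytes
maximum = cache_bytes(4)   # 65536 bytes = 64 Kbytes
```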

The cache is managed using a write-through policy 
that guarantees that the shared system memory will 
always have the most recent copy of all data; BXU 
caches never contain the only copy of revised data. 
Any time a processor updates a cache entry, it al- 
ways causes a write request on the AP-Bus, so that 
there are never any hidden updates. In addition, all
BXUs monitor AP-Bus traffic to detect if an update is
being made to a location which they are storing in
their own cache. If so, that line in the cache directory
is marked invalid. This procedure guarantees that a 
BXU cache will always return correct data even 
when a system uses multiple caches, when multiple 
processors treat a single data item differently (some
caching, some not), or when two processors are 
used on a single L-Bus. 
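
The write-through and invalidation rules can be illustrated with a toy model of two caches sharing one memory. Everything here is a simplification: the class is invented for illustration, lines are modeled as dictionary entries, and a write updates the local copy only if the line is already cached, since the text does not state what happens on a write miss.

```python
class WriteThroughCache:
    """Toy write-through cache that invalidates lines on observed bus writes."""

    def __init__(self, memory):
        self.memory = memory   # shared system memory (line address -> data)
        self.lines = {}        # locally cached lines

    def read(self, line):
        if line not in self.lines:             # miss: fill from shared memory
            self.lines[line] = self.memory[line]
        return self.lines[line]

    def write(self, line, data, all_caches):
        self.memory[line] = data               # every write goes to memory
        if line in self.lines:
            self.lines[line] = data            # keep an existing copy current
        for cache in all_caches:
            if cache is not self:
                cache.snoop_write(line)        # other caches see the bus write

    def snoop_write(self, line):
        self.lines.pop(line, None)             # mark the line invalid


memory = {0x100: "old"}
a = WriteThroughCache(memory)
b = WriteThroughCache(memory)
b.read(0x100)                     # b now caches the line
a.write(0x100, "new", [a, b])     # b's copy is invalidated
value = b.read(0x100)             # refetched from memory
```

After the write, `value` holds the updated data because the stale copy in `b` was invalidated and refilled from shared memory.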

An example of an SRAM control design using a single
BXU is shown in Figure 5. The BXU supplies six
memory control signals to interface the directory and
control logic with an external cache composed of
static RAM: Cache Read (CR), Cache Write (CW),
Way0 (WY0), Way1 (WY1), Word0 (WD0), and
Word1 (WD1). SRAM control also requires use of
the L-Bus byte enable (BE3-BE0) signals and certain
address lines. To simplify latching the byte enable
signals, the BXU asserts READY on all address
and recovery cycles as well as when it is transferring
data.
The tight timing specifications of SRAMs require a
small amount of external logic to interface a static
RAM cache to a BXU. Since all BXU cache signals
have a relatively wide clock to data valid specification
(Tcd), external flip-flops are used to achieve
tighter resolution of the Cache Write and Word edges.
The address bits are latched using ALE from the
processor. Way0 selects between the two "ways" in
the cache directory, and Way1 selects between the
cache and private memory (if present on the L-Bus).

In order to ensure that the cache is filled properly,
the byte enable latch is cleared on read requests. If
the processor made a read request for two bytes
that missed the cache, the BXU would first write the
entire 16-byte block, then return the requested
information to the processor. If the byte enable latches
weren't set, then the write into the cache wouldn't
work correctly because not all byte enables would
be asserted. Byte enable information does not need
to be held on reads because data is always returned
in full words and the processor selects the portion of
the word that it needs internally. Signal timings are
shown in Figures 6-10.



Figure 6. Cache Read Signal Timing for 35 ns SRAMs 
Figure 10. Cache Signal Timing for a 4-Word Read with a Cache Fill for 35 ns SRAMs
The BXU has four memory address recognizers for
the L-Bus plus an additional recognizer for
initialization RAM. Three of the memory address recognizers
(Mask2-0 and Match2-0) map to shared system
memory, while the fourth address recognizer maps
requests to SRAM on the local bus, called private
memory. The INIT-RAM recognizer serves two
functions: it enables bootstrap software to use the
SRAM cache as a scratch pad during system
initialization, and it provides the means for executing a
memory test on the SRAM cache. The private memory
recognizer allows SRAM to be used on the local
bus as normal memory in addition to a cache. Private
memory is not accessible by other modules on
the AP-Bus.
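
As an illustration of how a mask/match address recognizer of this kind typically works, the sketch below compares only the address bits selected by the mask. The comparison rule, the register widths, and the example values are assumptions made for illustration; the actual Mask and Match register formats are defined in the 80960MC Hardware Designer's Reference Manual.

```python
def recognizes(address, mask, match):
    """Return True when the address bits selected by mask equal those in match.

    Assumed semantics: a 1 in mask means "compare this bit"; a 0 means
    "don't care". This is a common convention, not a documented BXU format.
    """
    return (address & mask) == (match & mask)


def first_recognizer(address, recognizers):
    """Return the index of the first (mask, match) pair that hits, or None."""
    for index, (mask, match) in enumerate(recognizers):
        if recognizes(address, mask, match):
            return index
    return None


# Hypothetical setup: three recognizers covering different 256-Mbyte regions.
EXAMPLE_RECOGNIZERS = [
    (0xF0000000, 0x00000000),
    (0xF0000000, 0x10000000),
    (0xF0000000, 0x20000000),
]
```

With this setup, `first_recognizer(0x1234_5678, EXAMPLE_RECOGNIZERS)` selects the second recognizer, and an address outside all three regions is not claimed by any of them.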


Memory Module Support 

When operating in Memory mode, the BXU is a Local
Bus master and only handles requests inbound
from the AP-Bus. The cache control logic is disabled
since it is unnecessary in a memory module.

A read request received by an idle BXU will be seen
on the L-Bus 1.5 clock cycles after it was received
on the AP-Bus. BXUs offer two reply speed options 
for inbound Read requests. The high-performance 
option, called the “fast reply” mode, allows data to 
flow onto the AP-Bus with only a half-cycle delay 
through the BXU. This option requires the L-Bus 
memory controller to be able to supply data on every 
clock cycle. In the “slow reply” mode, the BXU buff- 
ers the entire AP-Bus reply packet before sending it 
onto the AP-Bus. This option permits the use of 
slower, less costly memory. 

Write requests are fully buffered before being 
passed to the L-Bus. Once the BXU has received an 
error-free packet, it initiates the L-Bus transaction. 
When the last data word has been accepted on the 
L-Bus, the BXU generates a reply on the AP-Bus. 

In memory mode, the BXU provides two or four
Read-Modify-Write locks with timeouts. Four locks
are available if the module is not interleaved with
other modules, two locks if it is interleaved. When
interleaving occurs, address bit 4 is used as part of
the address recognition for the module, which thereby
restricts a module to use either locks 0 and 2, or
1 and 3. This approach ensures that if a bus switch
occurs, the locks that may have been allocated on
the failed bus will not overlap with locks that are
currently allocated on the surviving bus (since all
traffic is rerouted to the surviving bus).


FAULT TOLERANCE 

Three basic tenets form the basis for the implementation
of 80960MC fault tolerant systems. First,
fault tolerant functions are achieved through the
replication of VLSI components. Second, the system is
partitioned into a set of confinement areas which
form the basis of error detection and recovery. Third,
only bus-oriented communication paths are used to
provide system communication.


The BXU is unique in that it provides all the functions 
necessary to detect, isolate, and recover from a fail- 
ure in any single system module or AP-Bus. Unlike 
many other fault tolerant system designs, 80960MC 
systems do not rely on voter components for fault 
detection, thereby eliminating one potential source 
of single-point failures. Although the BXU registers 
must be initialized by software, all the fault tolerant 
mechanisms are built into the hardware, and correct 
fault recovery of a system built using the BXU does 
not depend on software intervention. 


The purpose of a confinement area is to inhibit damage
from error propagation and to isolate the faulty
area for subsequent recovery and repair. A confinement
area is defined as a unit (system module or
AP-Bus) that has a limited number of tightly controlled
interfaces. Figure 11 shows the confinement
areas within a small system. Detection mechanisms
exist at every interface to ensure that no inconsistent
data can leave the confinement area and corrupt
other confinement areas. When a fault occurs in the
system, it is immediately isolated to a confinement
area. The fault is known to be in that confinement
area, and all other confinement areas are known to
be fault-free. All intermodule communication in an
80960MC system occurs over buses. There are no
point-to-point or daisy-chained signals.



This arrangement makes modular growth and on- 
line repair possible since no signal definition is de- 
pendent on the number of resources in the system. 
The presence or absence of any module cannot pre- 
vent communication between any other modules. 
The AP-Bus provides a uniform communications 
matrix that allows multiprocessor and fault-tolerant 
systems to expand modularly. 


In 80960MC systems, there are three distinct steps
in responding to an error. First, the error is detected
and isolated to a confinement area. Next, the error is
reported to all the modules in the system. This action
prevents the incorrect data from propagating
into another confinement area and provides all the
modules with the information required to perform
recovery. Finally, the faulty confinement area is isolated
from the system. Recovery occurs through the
application of redundant resources available in the
system. Table 5 describes the fault-tolerant control
registers.
Figure 11. Fault Confinement Areas in an 80960MC System


Table 5. Fault Tolerance Support Registers and Commands

Register                 Description

Test Type                The Test Report command instructs the BXU to test
                         the error reporting network. The type of error
                         report generated is determined by the content of
                         this register.

Spouse ID                In a QMR module, this register holds the module ID
                         of the FRC module to which this module is married.

QMR                      The contents of this register determine if a module
                         is part of a QMR pair, and if it should function as
                         the primary or shadow in the pair.

Module Error ID          Identifies the BXU as part of a specific module
                         confinement area.

Bus Error ID             Determines the Bus ID contents in an error report.

Error Log                Records the type of the most recent error report
                         received and the number of errors that have
                         occurred since the last Terminate Permanent Error
                         Window command.

Error Record             Holds the contents of the previous error report.

FT2                      Holds additional fault-tolerant control parameters.

Test Report Command      The Test Report command instructs the BXU to test
                         the error reporting network. The type of error
                         report generated is determined by the contents of
                         the Test Type Register.

Primary Catastrophe      A write to this register causes a Primary
Command                  Catastrophe error report, usually indicating a
                         primary module power failure.

Shadow Catastrophe       A write to this register causes a Shadow
Command                  Catastrophe error report, usually indicating a
                         shadow module power failure.

Terminate Permanent      A write to this register closes the permanent error
Error Window Command     window, so that a recurrence of a previous error is
                         not recorded as permanent.

Attach Bus Command       A write to this register causes the identified bus
                         to be attached to the system and become active.

Detach Bus Command       A write to this register causes the identified bus
                         to be detached from the system and become inactive.

Sync Refresh Command     A write to this register causes BXUs in memory mode
                         to assert their ForceRef pin and enables AP-Bus
                         address matching.
Functional Redundancy Checking

BXU components can be paired together to com- 
pare their outputs to ensure that they agree. This 
detection mechanism is called Functional Redun- 
dancy Checking (FRC) because identical compo- 
nents are used to check operations. 

At initialization time, one component in the BXU pair
is selected to be the "Master", while the other is
designated the "Checker". The Master BXU is
responsible for carrying out the normal operation of
the system and behaves as it would if it were operating
in a non-fault tolerant system. The Checker BXU,
in contrast, disables its AP-Bus outputs and instead
monitors the AP-Bus pins of the Master (see Figure
12). The Checker BXU is responsible for duplicating
the operation of the Master and using its internal
comparison circuitry to detect any inconsistency
between its result and the output of the Master.

The Master and Checker BXUs run in lock step,
comparing operations cycle-by-cycle. If at any point
the Master and Checker disagree, an FRC error will be
signaled and an error reporting cycle will begin.
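
The cycle-by-cycle comparison can be pictured with the following minimal sketch. It models each BXU's pin values as one integer per bus cycle; the function name and this representation are hypothetical and are chosen only to show the principle, since the real comparison is done by circuitry inside the Checker.

```python
def first_frc_disagreement(master_cycles, checker_cycles):
    """Compare Master and Checker outputs cycle by cycle.

    Each argument is a sequence holding one output value per bus cycle.
    Returns the index of the first cycle where the two differ, or None
    when the components agree on every cycle.
    """
    for cycle, (master, checker) in enumerate(zip(master_cycles, checker_cycles)):
        if master != checker:
            return cycle      # an FRC error would be signaled here
    return None
```

In hardware the comparison happens as each cycle completes, so a disagreement is caught in the cycle in which it occurs rather than after the fact.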

When using the FRC mechanism, the BXU pins 
comprising the electrical connection to the AP-Bus 
must be connected together. A BXU provides FRC 
coverage on the AD, SPEC, BOUT and MODCHK 
pins. 


Failures in the Checker's AP-Bus drivers can be
detected by reversing the role of the Master and
Checker BXU. When Master/Checker Toggling is
enabled, the roles of the Master and Checker are
switched after each bus cycle.


Parity, Duplication and Timeouts 

In order to prevent incorrect AP-Bus operation from
passing corrupted data to the BXU (and onto the
Local Bus), the BXU uses parity, signal duplication,
and bus timeouts to check for errors. Specifically,
the AP-Bus has interlaced parity bits covering the
AD and SPEC signals, signal duplication is used on
both arbitration and RPYDEF, and a bus timer is set
to monitor the bus for non-response to a request.


Figure 12. Functional Redundancy Checking (FRC)

The BXU calculates two separate parity bits across
alternate AD and SPEC signals, which are indicated
by the CHK0 and CHK1 pins. CHK0 is even parity
across the even AD and SPEC pins, and CHK1 is
even parity across the odd pins. Since the arbitration
and RPYDEF lines are driven independently by multiple
bus agents (BXUs), parity cannot be used for
error detection; instead, errors are detected
by duplicating each set of lines, one set for Masters,
the other set for Checkers. Consequently, each BXU
connects to only one arbitration network. If there is a
disagreement between the two sets of signals on
the AP-Bus, it will be detected through an FRC
disagreement.

The BXU uses a timer to determine whether too long a
period has elapsed since a bus request was made
without a response being received. During normal
operation the timer is active whenever
the bus pipeline is not empty. The timer is reset on
every bus reply or deferral. If the BXU was the
source of the request and a timeout occurs, it signals
a Bad Access Reply on the AP-Bus. The timer
is nominally 64 clocks.
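
The interlaced parity scheme can be sketched as follows. The sketch assumes that "even" and "odd" refer to each pin's own index within the AD31-AD0 and SPEC5-SPEC0 groups, and that a check bit is the XOR of the covered bits so that the covered bits plus the check bit contain an even number of ones. The electrical polarity of the pins is ignored, and the function names are hypothetical.

```python
EVEN_AD, ODD_AD = 0x55555555, 0xAAAAAAAA   # AD0, AD2, ... and AD1, AD3, ...
EVEN_SPEC, ODD_SPEC = 0x15, 0x2A           # SPEC0, 2, 4 and SPEC1, 3, 5


def xor_of_bits(value):
    """Return 1 when an odd number of bits are set in value, else 0."""
    return bin(value).count("1") & 1


def check_bits(ad, spec):
    """Return (CHK0, CHK1) for a 32-bit AD value and a 6-bit SPEC value."""
    chk0 = xor_of_bits(ad & EVEN_AD) ^ xor_of_bits(spec & EVEN_SPEC)
    chk1 = xor_of_bits(ad & ODD_AD) ^ xor_of_bits(spec & ODD_SPEC)
    return chk0, chk1
```

One consequence of interlacing is that a fault affecting two adjacent lines flips both check bits instead of cancelling out within a single parity group.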


Error Reporting

The error reporting network is the backbone of fault
isolation and recovery. When an error is detected,
the BXU detecting the error reports its type and
location to all other nodes in the system. The error
reporting network is designed so that, independent
of an error in the system, each node not only
receives an error report, but is guaranteed to receive
the same error report. Each BXU in the system
uniformly logs each error report, and is able to use this
information to proceed independently with the
appropriate recovery procedure.

The BXU has two serial Error Reporting Lines
associated with each bus interface (BERLs for the
AP-Bus and LERLs for the Local Bus). An identical
serial error report is sent over each pair of lines
associated with each bus.

An AP-Bus error reporting cycle consists of five
phases: Reporting, Partner Communications, Transient
Waiting Period, Retry, and the Permanent Error
Window (see Figure 13). The reporting phase lasts
256 cycles from the beginning of the first report
received on the BXU's error reporting lines. The BXU
becomes quiescent as soon as it detects the start bit
of an error report and remains quiescent through the
Transient Waiting Period.





During partner communications, BXUs communicate 
with each other via their POPQUE lines to determine 
whether to retry accesses in the case that one of the 
AP-Buses is removed from the system. Partner or- 
dering lasts 256 cycles. 

Transient waiting enables the system to sustain
disturbances from mechanical vibrations and brief
electrical transients without needing to permanently
reconfigure the system. The BXUs simply wait a
predetermined time for the transient to subside. The
duration of the Transient Waiting Period is adjustable
and can be set by software (16 µs to 500 ms at
16 MHz). During this period, the BXU completes its
internal recovery mechanisms (if the error is
permanent). Since the transient waiting mechanism on the
buses depends on all buses moving to the retry
state at the same time, all BXUs must have identical
values for the Transient Waiting Period.
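
The arithmetic behind these figures is easy to check. The sketch below converts a Transient Waiting Period into clock cycles and estimates how long a BXU stays quiescent, assuming the Reporting, Partner Communications, and Transient Waiting phases run back to back and that quiescence begins at the start of the Reporting phase. The function names are hypothetical.

```python
CLOCK_MHZ = 16            # clock rate quoted in the text
REPORTING_CYCLES = 256    # length of the reporting phase
PARTNER_CYCLES = 256      # length of the partner communications phase


def cycles_from_us(microseconds, clock_mhz=CLOCK_MHZ):
    """Convert a duration in microseconds to clock cycles."""
    return microseconds * clock_mhz


def quiescent_us(transient_us, clock_mhz=CLOCK_MHZ):
    """Estimate the quiescent time in microseconds for a given waiting period."""
    total_cycles = (REPORTING_CYCLES + PARTNER_CYCLES
                    + cycles_from_us(transient_us, clock_mhz))
    return total_cycles / clock_mhz
```

At 16 MHz the minimum setting of 16 µs corresponds to 256 cycles and the maximum of 500 ms to 8,000,000 cycles. With the minimum setting, the three phases together take 768 cycles, or 48 µs.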


During the RETRY phase, all accesses that were
pending at the time that the error report was
received will be retried. At the same time as RETRY
begins, the BXU enters the Permanent Error Window.
During this interval, the BXU watches for the
error to reoccur.

Each BXU has two registers that are used for logging
error reports. The ERROR LOG register contains
the current error report and the ERROR RECORD
register contains the previous error report.
When an error report is received, the contents of the
ERROR LOG register are copied into the ERROR
RECORD register. Both registers are accessible by
software and are the primary means by which the
software routines responsible for system management
communicate with the hardware fault handling
mechanisms. Table 6 lists the types of errors that
can be reported.


Table 6. Error Types Reported

Error Type               Description

Unsafe Confinement       This type of report is issued when an error is
Area                     detected that would make a retry dangerous.

Primary Catastrophe      Generated in response to a Primary Catastrophe
                         Command from software. The command is usually
                         issued when all primary modules are about to fail
                         because of a loss of power.

Shadow Catastrophe       Generated in response to a Shadow Catastrophe
                         Command from software. The command is usually
                         issued when all shadow modules are about to fail
                         because of a loss of power.

Error Reporting Error    The report indicates that a BXU has detected a
                         failure on one of its error reporting lines.

Bus Arbitration          This report is issued when an FRC error is detected
                         on the BOUT pin of the BXU, indicating a bus
                         arbitration error.

Bus Parity               Indicates that a parity error has been detected on
                         the AP-Bus.

Component                Indicates that a checker has detected an FRC error
                         while its master was driving the AP-Bus.

Uncorrectable Array      An uncorrectable error has been detected in one of
Error                    the memory arrays.

Correctable ECC          A correctable error has been detected in one of the
                         memory arrays.

COM Altered              This error report occurs when the COM input is
                         toggled (two cycles high, followed by two cycles
                         low) and may be used by external circuits to notify
                         the system of an external fault.

Attach Bus               Issued in response to an Attach Bus command, this
                         report is used to reactivate a bus that was
                         previously out of service.

Detach Bus               Issued in response to a Detach Bus command, this
                         report is used to remove a faulty bus from the
                         system.

Terminate Permanent      Receiving this report signifies the end of the
Error Window             Permanent Error Window.

Sync Refresh             Used to synchronize memory modules that are being
                         married to form a Primary/Shadow Pair.
The BXU's hardware compares the contents of the
two error reporting registers to determine if a bus
retry has resulted in a repeat of the previous error
(which therefore must be considered a permanent
error). Software can clear the two registers by sending
a Terminate Permanent Error Window command.
The registers allow software to monitor the health of
the system and to respond appropriately in case of
hardware problems. The availability of this information
simplifies diagnostic routines.
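
The interplay between the two registers can be modeled in a few lines. This is a simplified sketch, not the BXU's actual logic: error reports are represented as arbitrary comparable values, the Permanent Error Window is assumed to open as soon as a report is received (in the BXU it opens when RETRY begins), and the class and method names are hypothetical.

```python
class ErrorRegisters:
    """Toy model of the ERROR LOG and ERROR RECORD registers."""

    def __init__(self):
        self.error_log = None       # current error report
        self.error_record = None    # previous error report
        self.window_open = False    # Permanent Error Window state

    def receive(self, report):
        """Log a report; return True if it repeats the previous one."""
        self.error_record = self.error_log    # LOG is copied into RECORD
        self.error_log = report
        permanent = self.window_open and self.error_log == self.error_record
        self.window_open = True
        return permanent

    def terminate_permanent_error_window(self):
        """Model the software command that clears both registers."""
        self.error_log = None
        self.error_record = None
        self.window_open = False
```

A report that recurs while the window is open is classified as permanent. The same report arriving after software has closed the window is treated as a new, possibly transient, error.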

The ERROR LOG register is handled independently
by hardware and software; hardware always responds
immediately to an error report so that it is
never lost by failure of software to respond. During
normal system operation, software should never
write to this register, since it is both read and written
by hardware. The ERROR LOG register is cleared
on a cold start, but its contents are retained across a
warm start.


RECOVERY MECHANISMS 


Module Shadowing 

Automatic recovery from permanent single-point failures
in a module is accomplished through module
shadowing, or what is more formally called Quad
Modular Redundancy (QMR). Using this technique,
two FRC pairs (master/checker) of the same type
are logically linked to form a primary/shadow pair
(see Figure 14). The marriage of the two modules is
performed by software which sets the logical ID of
the two modules equal and restarts them in lock
step (or synchronous operation). There is no direct
electrical connection between a primary/shadow
pair. They are usually on separate boards so that
either can be removed in the case of a failure in that
module.


Figure 14. In Quad Modular Redundancy (QMR), Self-Checking Modules are Paired
The primary/shadow pair operates in lock step so
that there is always a complete and current backup 
for an FRC pair. At any point in time, one FRC pair 
will be active (i.e., sending its output to the AP-Bus) 
while the other will be passive (i.e., its outputs will be 
disabled). Initially, the primary FRC pair is active and 
is responsible for issuing requests or replies to the 
AP-Bus. Data leaves only by means of the active 
FRC pair. 

As an option, the roles of active and passive modules
are switched after every second bus cycle. (In
contrast, master/checker pairs are toggled every
cycle.) This ping-pong action exercises all of the logic
in both primary and shadow modules. Any latent failure
that exists in the AP-Bus drivers will be detected
immediately. All of the logic to perform this lock step
operation is contained in the BXU and neither the
processors nor any discrete logic contained in a
module is aware that the module is participating as
one-half of a primary/shadow pair.

Each physical FRC pair (primary and shadow) remains
a self-checking pair. Whether in an active or
passive module, all detection mechanisms remain
enabled and continuously check the operation of
that module. Neither the primary nor the shadow
checks the operation of the other; FRC is used for
fault detection, while module shadowing (Quad Modular
Redundancy) is used to ensure immediate
recovery.


Automatic Module Recovery 

If a permanent error is detected in either a primary or
a shadow FRC pair, the faulty pair will immediately
be disabled as all BXUs in the pair shut down. The
surviving spouse then separates itself from the faulty
FRC pair and operates as an active pair on every
bus cycle. At that point, recovery is complete.

Hardware recovery is autonomous and requires no
software intervention to complete. The operating
system can be informed that a hardware reconfiguration
has taken place by tying an error report line to
one of the processor's interrupt pins. Then when a
fault occurs, a processor can examine the error report
log to discover what has happened and then
re-examine the system configuration. Figure 15 shows
an example of module recovery.



Figure 15. Faulty Modules are Automatically Disabled 
Bus Switching

All AP-Buses in an 80960MC system are physically
identical, but when a system is operational each bus
handles a unique address range. The BXU has been
designed so that it is possible to pair together two
AP-Buses and have them act as redundant or alternate
resources for each other. AP-Bus 0 is paired
with AP-Bus 1 and AP-Bus 2 is paired with AP-Bus 3.
In order for an FRC pair to have an additional bus, it
must also have another pair of Master/Checker
BXUs. Normally the memory addresses will be interleaved
between the two (or four) buses, but this isn't
necessary for bus switching.

Since the AP-Bus does not hold state information
(as do processors and memory), all buses in the system
may be used during normal operation. There is
no degradation of throughput to achieve bus
redundancy. Each bus is fully operational.

When a permanent error has been detected on an 
AP-Bus, all BXUs on the faulty bus disable them- 
selves. L-Bus requests for the failed bus will be ig- 
nored by the disabled BXUs and picked up instead 
by the BXUs attached to the backup bus. If a BXU 
has a cache, the BXU invalidates its cache directory 
since the directory must be reorganized to match the 
new (and larger) address space, including a new in- 
terleaving factor. Figure 16 shows an example of 
bus switching. 


Legend: C = CPU, B = BXU, M = Memory Array

Hardware automatically reconfigures to bypass the faulty bus (AP-Bus 0).
AP-Bus 1 takes over the address space of AP-Bus 0.

Figure 16. If a Bus Fails, Its Backup Bus Takes Over Immediately
Self-Healing Systems

In some applications it is important to guarantee the
integrity of the data, but momentary interruptions in
processing can occur without seriously affecting
operations or jeopardizing human lives. For these
applications, a cost-effective approach may be to use
self-healing systems.

Self-healing systems use Functional Redundancy
Checking to ensure that all errors are detected and
that faults are confined within a module. Fault recovery
is not automatic; recovery and reconfiguration are
done by software following error detection.
Self-healing systems are less costly than fully fault-tolerant
systems because fewer components are necessary.

Self-healing systems do not operate continuously in
the case of a hardware failure. Program execution
cannot proceed after detection of a permanent error
until the system has been reconfigured. Transient
errors will still be taken care of by the hardware
components. Upon detection of a permanent error,
the system will cease operation; however, FRC
ensures that no data will have been corrupted.

After the system stops, it must be reset and a
diagnostic program run which reads the BXU error logs
and determines the most appropriate action to take.
Recovery and reconfiguration may be complete and
the system back on-line within a few seconds to
several minutes, depending on the nature of the fault.

Self-healing systems are not appropriate for real-time
applications where program delays longer than
a few milliseconds cannot be tolerated. In these critical
applications, an interruption in system operation
might result in damage to expensive material and
equipment, or endangerment of human lives. The
80960MC system fault tolerant architecture provides
the means for building systems that will recover
automatically within 48 µs.


BXU Registers

Initialization and control of the BXU is done by reading
and writing the BXU's internal registers. The registers
are mapped to the upper 16 Mbytes of the
80960MC processor's physical address space.

Initialization of a system using BXUs occurs in three
stages. In the first stage, which immediately follows
RESET, all registers (except for the registers containing
error report information) are loaded with 0 or
with values sampled off a set of pins. During this
stage the BXU's System Bus ID and
mode of operation are established. In the second
stage, software assigns logical, physical, and arbitration
IDs to each BXU. Then in the third stage, the
COM pin can be used to load board-specific information
into the BXU and software can change the
default values of any of the registers.

Once software has established the initial configuration
of the system, no further interaction between
the system software and the BXU may be necessary
except for testing the error reporting functions and
for making on-line changes to the system's initial
configuration.

This Advance Information Data Sheet contains a
functional description for each of the BXU's major
register groups. For more specific details on controlling
each of the registers, please consult the
80960MC Hardware Designer's Reference Manual.


SIGNAL DESCRIPTIONS 

Tables 7 through 11 describe the function of each of
the BXU signals. Many of the pins are multiplexed
and have different interpretations depending on
whether the BXU is in Processor or Memory mode.
Table 7. M82965 BXU L-Bus Signals

Symbol       Type       Name and Function

LAD31-LAD0   I/O, T.S.  LOCAL ADDRESS/DATA BUS: Carries 32-bit physical
                        addresses and data to and from a processor or
                        memory. During an address (Ta) cycle, bits 2-31
                        contain a physical word address (bits 0-1 indicate
                        SIZE; see below). During a data (Td) cycle, bits
                        0-31 contain read or write data. The LAD lines are
                        active HIGH and float to a three-state OFF when
                        the bus is not acquired.

                        SIZE: Comprises bits 0-1 of the LAD bus during a
                        Ta cycle and specifies the size of a transfer in
                        words.

                        LAD1  LAD0
                          0     0    1 Word
                          0     1    2 Words
                          1     0    3 Words
                          1     1    4 Words

ALE          O, T.S.    ADDRESS-LATCH ENABLE: Indicates the transfer of a
                        physical address. ALE is asserted during a Ta
                        cycle, and deasserted during Td cycles and the
                        second half of Ta cycles. It is active LOW and
                        floats to a three-state OFF when the L-Bus is not
                        acquired.

ADS          I/O, O.D.  ADDRESS STATUS: Is used to detect address cycles
                        and additional data cycles.

CACHE        I          CACHEABLE: During a Ta cycle, specifies whether
                        data is cacheable. When operating in the MEMORY
                        mode this pin should be tied to ground through a
                        10 kΩ resistor.

W/R          I/O, O.D.  WRITE/READ: Specifies, during a Ta cycle, whether
                        the operation is a write or read. It is latched
                        on-chip and remains valid during Td cycles.

CW/DEN       O, O.D.    CACHE WRITE: (Defined only when the BXU is in
                        PROCESSOR mode). This signal indicates that the
                        cache SRAM should be written with data from the
                        L-Bus and is used to generate the chip select and
                        write enable signals required by the SRAM. The
                        signal is open drain so it can be shared among
                        multiple BXUs controlling a single set of SRAMs.

                        DATA ENABLE: (Defined only when the BXU is in
                        MEMORY mode). Is asserted during Td cycles and
                        indicates transfer of data on the local AD bus
                        lines.

CR/DT/R      O, O.D.    CACHE READ: (Defined only when the BXU is in
                        PROCESSOR mode). This signal indicates that the
                        cache SRAM should drive data onto the L-Bus in
                        response to a read request and is used to generate
                        the chip select and output enable signals required
                        by the SRAM. This signal is open drain so it can
                        be shared among multiple BXUs controlling a single
                        cache.

                        DATA TRANSMIT/RECEIVE: (Defined only when the BXU
                        is in MEMORY mode). Indicates the direction of
                        data transfer. It is low during Ta and Td cycles
                        for a read or interrupt acknowledgement; it is
                        high during Ta and Td cycles for a write. DT/R
                        never changes state when DEN is asserted.

LOCK         I          BUS LOCK: Is used by the BXU to distinguish
                        between normal reads and RMW-reads, normal writes
                        and RMW-writes.

                        An 80960MC processor asserts LOCK at the beginning
                        of an RMW cycle, and the BXU recognizes it as an
                        RMW-read. If the read operation is accepted by the
                        module serving memory, the processor drops LOCK,
                        and executes an RMW-write. LOCK is also held
                        asserted during an interrupt-acknowledge
                        transaction.

READY        I/O, O.D.  READY: Indicates that data on LAD lines can be
                        sampled or removed. If READY is not asserted
                        during a Td cycle, the Td cycle is extended to the
                        next cycle, and ADS is not asserted in the next
                        cycle. READY is driven on Ta, Tr and Td cycles.

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain, T.S. = three-state
Table 7. M82965 BXU L-Bus Signals (Continued)

Symbol       Type       Name and Function

BE3-BE0      I/O, O.D.  BYTE ENABLES: Specify which data bytes on the
                        local bus will take part in the next bus cycle.
                        BE3 corresponds to LAD24-LAD31 and BE0 corresponds
                        to LAD0-LAD7.

HOLD/        I          HOLD: Indicates that a master I/O peripheral
HOLDAR                  requests control of the bus. When the BXU receives
                        HOLD and grants the peripheral control of the bus,
                        it floats the bus lines and then asserts HLDA and
                        enters the Th state. When HOLD is deasserted, the
                        BXU will deassert HLDA and go to either the Ti or
                        Ta state.

                        HOLD ACKNOWLEDGE REQUEST: Is an input to the
                        secondary bus master that the primary bus master
                        has relinquished control of the bus.

HLDA/        O          HOLD ACKNOWLEDGE: Relinquishes control of the bus
HOLDR                   to a master I/O peripheral.

                        HOLD REQUEST: Is used by a Secondary Bus Master to
                        request use of the bus from the Primary Bus
                        Master.
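
As an illustration of the encodings in Table 7, the sketch below splits the value seen on the LAD lines during an address cycle into its word address and SIZE fields, and lists the LAD bit ranges selected by a set of byte enables. The function names are hypothetical, and byte enables are passed in their logical sense (bit n set means BEn is asserted) without regard to pin polarity.

```python
def decode_address_cycle(lad):
    """Split a 32-bit address-cycle value into (word_address, size_in_words).

    Bits 2-31 carry the word address; bits 0-1 carry SIZE, where the
    encodings 0 through 3 mean transfers of 1 through 4 words.
    """
    word_address = lad & 0xFFFFFFFC
    size_in_words = (lad & 0b11) + 1
    return word_address, size_in_words


def enabled_byte_lanes(byte_enables):
    """Return the (low_bit, high_bit) LAD ranges selected by BE3-BE0."""
    return [(8 * n, 8 * n + 7) for n in range(4) if byte_enables & (1 << n)]
```

For example, an address-cycle value of 0x10000003 denotes a 4-word transfer starting at word address 0x10000000.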


Table 8. M82965 BXU L-Bus Module Support Signals

Symbol       Type       Name and Function

BADAC        O, O.D.    BAD ACCESS: If asserted in the cycle following the
                        one in which the last READY of a transaction is
                        asserted as a result of a bad access, it indicates
                        that the transaction has exceeded the AP-Bus
                        time-out period.

IAC0/^       I/O, O.D.  INTERAGENT COMMUNICATION: PROCESSOR 0: (Defined
                        only when the BXU is in PROCESSOR mode). Is an
                        open-drain output that indicates that there is a
                        pending IAC message for Processor 0 on the BXU's
                        local bus.

                        EXTERNAL ERROR: (Defined only when the BXU is in
                        MEMORY mode). Is an input that indicates that an
                        error has been detected in external logic (e.g., a
                        failure in a discrete memory controller).

IAC1/F^      O, O.D.    INTERAGENT COMMUNICATION: PROCESSOR 1: (Defined
                        only when the BXU is in PROCESSOR mode). Is an
                        open-drain output that indicates that there is a
                        pending IAC message for Processor 1 on the BXU's
                        local bus.

                        FORCE REFRESH: (Defined only when the BXU is in
                        MEMORY mode). Is an open-drain output that tells
                        the external memory controller to immediately
                        execute a refresh operation.

PFETCH       I          PREFETCH: Is used in conjunction with the Cache
                        and Write/Read (W/R) signals to define the type of
                        request being issued (0 = LO, 1 = HI):

                        PFETCH  CACHE  W/R
                          0       0     0   Read using Prefetch Channel 0
                          0       0     1   Start for Prefetch Channel 0
                          0       1     0   Read using Prefetch Channel 1
                          0       1     1   Start for Prefetch Channel 1
                          1       0     0   Noncacheable Read
                          1       0     1   Noncacheable Write
                          1       1     0   Cacheable Read
                          1       1     1   Cacheable Write

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain
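
The PFETCH, CACHE, and W/R encoding in Table 8 maps directly onto a lookup table. The sketch below simply restates the eight rows; the names `REQUEST_TYPES` and `request_type` are hypothetical, and the inputs are the logic levels from the table (0 = LO, 1 = HI).

```python
# (PFETCH, CACHE, W/R) -> request type, as listed in Table 8.
REQUEST_TYPES = {
    (0, 0, 0): "Read using Prefetch Channel 0",
    (0, 0, 1): "Start for Prefetch Channel 0",
    (0, 1, 0): "Read using Prefetch Channel 1",
    (0, 1, 1): "Start for Prefetch Channel 1",
    (1, 0, 0): "Noncacheable Read",
    (1, 0, 1): "Noncacheable Write",
    (1, 1, 0): "Cacheable Read",
    (1, 1, 1): "Cacheable Write",
}


def request_type(pfetch, cache, write_read):
    """Return the request type for the given pin levels."""
    return REQUEST_TYPES[(pfetch, cache, write_read)]
```

Note that when PFETCH is low, the CACHE pin selects the prefetch channel rather than cacheability.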
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Table 9. M82965 BXU AP-Bus Signals 


Symbol 

Type 

Name and Function 

AD31-AD0 

I/O 

O.D. 

SYSTEM ADDRESS/DATA LINES: Carry 32-bit addresses and data 
between modules (BXUs) on an AP-Bus. The content of the AD lines is 
defined by the SPEC encoding during the same bus cycle. 

SPEC5-SPEC0 

I/O 

O.D. 

PACKET SPECIFICATION: Signals define the packet type and the 
parameters required for the transaction: 

SPEC5: REQUEST: Is asserted if the packet is a request packet. 

SPEC4: MULTICYCLE: Is asserted if the packet consists of more than 
one bus cycle. 

SPEC3-SPEC2: CYCLE COUNT: These two bits are used in 
conjunction with Request and Multicycle signals to specify the length 
of the packet (in bus cycles) and the data length (in words). 

SPEC1-SPEC0: OPERATION/STATUS TYPE: These two bits identify 
the specific operation or status conveyed by the packet. 

CHK1-CHK0 

I/O 

O.D. 

CHECK SIGNALS: Provide interlaced parity for the SPEC and AD 
lines. 

ARB3-ARB0 

I/O 

O.D. 

ARBITRATION: Signals are used by the bus agents (BXUs) to 
determine which agent has access to the bus next. These signals have 
a timing that is one-half cycle out of phase with the AD lines. 

RPYDEF 

I/O 

O.D. 

REPLY DEFER: Signal allows an agent to give up its “slot” on the bus 
temporarily if its access is going to take a long time. This action 
reorders the pipeline, moving the deferred request to the bottom of the 
queue, resets the bus time-out counter and permits another agent to 
use the bus. 

BERL1-BERL0 

I/O 

O.D. 

BUS ERROR REPORT LINES: Are used to signal errors from bus 
transactions or from within modules connected to the bus. 


NOTES: 

I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain 
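
The SPEC5-SPEC0 field layout described in Table 9 can be illustrated with a small bit-field splitter. This is a sketch under stated assumptions: the names `SpecFields` and `split_spec` are hypothetical, a bit value of 1 is taken to mean "asserted" at the logical level (the electrical polarity of the open-drain lines is not modeled), and the meanings of the individual cycle-count and operation/status codes are not given here, so they are returned as raw two-bit values.

```python
from typing import NamedTuple


class SpecFields(NamedTuple):
    request: bool      # SPEC5: packet is a request packet
    multicycle: bool   # SPEC4: packet spans more than one bus cycle
    cycle_count: int   # SPEC3-SPEC2: raw two-bit cycle-count code
    op_type: int       # SPEC1-SPEC0: raw two-bit operation/status code


def split_spec(spec: int) -> SpecFields:
    """Split a 6-bit logical SPEC value (SPEC5 is the most significant bit)."""
    if not 0 <= spec <= 0x3F:
        raise ValueError("SPEC is a 6-bit field")
    return SpecFields(
        request=bool((spec >> 5) & 1),
        multicycle=bool((spec >> 4) & 1),
        cycle_count=(spec >> 2) & 0b11,
        op_type=spec & 0b11,
    )


if __name__ == "__main__":
    print(split_spec(0b110110))
```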

Table 10. M82965 BXU AP-Bus (Local Agent) Support Signals 


Symbol 

Type 

Name and Function 

CLK2 

I 

SYSTEM CLOCK: Provides the base timing and synchronization for all agents 
(BXUs) in the system. It is sourced to all agents from a central clock and is 
twice the frequency of the bus cycle. 

NOTE: 

The clock skew over the AP-Bus for a typical system should be no greater 
than 6 ns for correct system operation. 

BOUT 

I/O 

O.D. 

BUS OUTPUT CONTROL: Is asserted whenever a component is driving the 
AP-Bus. Functional Redundancy Checks on BOUT can be used to detect 
arbitration failures. 

MODCHK 

I/O 

O.D. 

MODULE CHECK: Is connected between Master/Checker pairs, allowing a 
Functional Redundancy Check to be performed on internal states. 

INITID 

I 

INITIALIZE ID: Is connected to one of the 32 AD lines and is used in 
conjunction with the IDENTIFY DEVICE IAC to provide a unique address for 
each BXU at initialization time. 

Vref 

I 

VOLTAGE REFERENCE: Provides a stable voltage reference for the input 
buffers of components connected to the AP-Bus. External hardware must 
provide a Vref/w voltage (see Table 14) on the Vref pin during normal 
operation of the component. The Vref pin is also used to distinguish between 
a warm start (system memory and the Error Record register retain their state) 
and a cold start (system memory and BXU registers are cleared). 

RESET 

I 

RESET: Forces all agents on the bus to reset and synchronize. The bus cycle 
begins the first CLK2 period after RESET is deasserted. The RESET signal is 
the way a BXU is synchronized to the rest of the system. 

COM 

I/O 

O.D. 

COMMUNICATION: Can be used to load information into a component as 
part of the initialization sequence or to inform external logic that the 
component has failed. The BXU will assert COM if it has shut itself off due 
to a failure in its module. 

The COM signal is not involved in any aspect of AP-Bus operation, but can be 
used to load board-dependent information into the BXU or to signal the rest of 
the system that an external error has occurred. 


NOTES: 

I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain 

Table 11. M82965 BXU Module Support Signals 


Symbol 

Type 

Name and Function 

WY0/COR 

O 

O.D. 

WAY0: (Defined only when the BXU is in PROCESSOR mode). Indicates which one of 
the two “ways” in a directory set had a cache hit. The line is intended 
to drive the SRAM address pins and will remain stable throughout the 
length of a cache access. 

CORRECT: (Defined only when the BXU is in MEMORY mode). Is used by the BXU to 
tell an external ECC controller to correct the memory data as it flows 
onto the local bus. If this signal is not asserted, then the memory data 
may flow directly onto the local bus with only error checking, but no 
correction. 

WY1/MEM 

O 

O.D. 

WAY1: (Defined only when the BXU is in PROCESSOR mode). 
Indicates if the access is for the cache or private memory half of the 
SRAM. The line is intended to drive the SRAM address lines directly 
and will remain stable throughout the length of a cache access. 

MEMORY/REGISTER REQUEST: (Defined only when the BXU is in 
MEMORY mode). This signal allows mapping some of the BXU’s 
register space out to the registers in an external controller. If the signal 
is high, the associated L-Bus request is a memory request; otherwise, 
the L-Bus request is to an external register on the board. 

WD0/UNC 

I/O 

O.D. 

WORD0: (Defined only when the BXU is in PROCESSOR mode). 
Provides the low order bit of the word address for the SRAM. Together 
with WORD1, the two bits indicate which of the four words within an 
address line should be addressed. Because SRAM timing is critical, an 
external latch could be required. The signals change for each word of 
data transferred. 

UNCORRECTABLE ECC: (Defined only when the BXU is in MEMORY 
mode). Is an input used by the external ECC logic to signal to the BXU 
that it has detected an uncorrectable memory error. 

WD1/ECC 

I/O 

O.D. 

WORD1: (Defined only when the BXU is in PROCESSOR mode). 
Provides the high order bit of the word address for the SRAM. 
Together with WORD0, the two bits indicate which of the four words 
within an address line should be addressed. Because SRAM timing is 
critical, an external latch will be required. The signals change for each 
word of data transferred. 

ECC ERROR: (Defined only when the BXU is in MEMORY mode). Is 
an input used by the external ECC logic to signal to the BXU that it has 
detected a memory error. The signal will be asserted even though 
external logic may be correcting the error and providing correct data 
on the L-Bus. If the BXU is asserting its CORRECT signal, the ECC 
ERROR signal will be ignored. Only the UNC pin will be checked for an 
error indication under these conditions. 

SSBUSY 

I/O 

O.D. 

SUBSYSTEM BUSY: Connects together all BXUs in a module that are 
in the same subsystem. When the signal is pulled low (BUSY), the 
BXUs will accept a request address, but will not continue with the data 
cycles. This signal is used to ensure that the BXUs always handle 
RMW-writes, Interagent Communication messages, and retries 
correctly. An external signal is needed because BXUs can generate 
AP-Bus requests internally because of the prefetcher, or their internal 
logic can be tied up handling an IAC request from the AP-Bus. 

Table 11. M82965 BXU Module Support Signals (Continued) 


Symbol 

Type 

Name and Function 

POPQUE 

I/O 

O.D. 

POP QUEUE: Is used by the two BXUs acting as bus backups for each 
other to communicate status on the completion of outstanding L-Bus 
requests. Usually, this signal is asserted when the oldest write in the 
queue has completed. During the partner ordering period, a different 
protocol is used to convey the status of all write requests outstanding. 

LERL1-LERL0 

I/O 

O.D. 

LOCAL ERROR REPORTING LINES: Are identical to the BERL 
signals defined for the AP-Bus, but are used on the module side to 
connect all BXUs on a single L-Bus. 


NOTES: 

I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain 


MECHANICAL DATA 


Pin Assignment 

The MG82965 BXU (PGA package) pinout as 
viewed from the top side of the component (pins 
down) is shown in Figure 17 and from the bottom 
side (pins up) in Figure 18. 


Vcc and GND connections must be made to multiple 
Vcc and GND pins. Each Vcc and GND pin must 
be connected to the appropriate voltage or ground 
and externally strapped close to the package. 
Preferably, the circuit board should include power and 
ground planes for power distribution. Table 12 lists 
the function of each pin. 

Many of the signals are multiplexed and several 
signals have different interpretations depending on 
whether the BXU is used in Processor or Memory 
mode. 

Figure 17. MG82965 BXU Pinout - View from Top Side (Pins Down) 

Figure 18. MG82965 BXU Pinout - View from Bottom Side (Pins Up) 

Table 12. M82965 PGA Pinout - In Pin Order 


Pin   Signal      Pin   Signal      Pin   Signal    Pin   Signal 
A1    LERL1       C6    AD22        H1    LAD30     M10   Vss 
A2    Vss         C7    AD24        H2    READY     M11   Vcc 
A3    POPQUE      C8    AD29        H3    BE1       M12   Vref 
A4    AD16        C9    SPEC2       H12   AD13      M13   BERL0 
A5    AD20        C10   Vcc         H13   AD15      M14   CHK1 
A6    AD21        C11   Vss         H14   AD4       N1    LAD23 
A7    AD25        C12   INITID      J1    LAD29     N2    LAD24 
A8    AD30        C13   ARB2        J2    LAD31     N3    LAD22 
A9    AD26        C14   AD1         J3    CACHE     N4    LAD21 
A10   AD28        D1    WD0/UNC     J12   BOUT      N5    LAD18 
A11   SPEC0       D2    PFETCH      J13   COM       N6    LAD15 
A12   SPEC3       D3    Vss         J14   AD8       N7    LAD12 
A13   SPEC5       D12   ARB0        K1    LAD28     N8    LAD10 
A14   Vcc         D13   AD0         K2    LAD26     N9    LAD6 
B1    Vss         D14   AD5         K3    LAD27     N10   LAD2 
B2    IAC0/ERR    E1    CW/DEN      K12   BERL1     N11   CLK2 
B3    IAC1/F^     E2    WY0/COR     K13   AD14      N12   LAD0 
B4    AD17        E3    WY1/MEM     K14   AD10      N13   RESET 
B5    AD18        E12   AD3         L1    ALE       N14   Vss 
B6    AD19        E13   AD7         L2    ADS       P1    Vcc 
B7    AD23        E14   ARB3        L3    HOLD      P2    Vss 
B8    AD27        F1    BE3         L12   Vss       P3    LAD19 
B9    SPEC1       F2    BE2         L13   CHK0      P4    LAD17 
B10   AD31        F3    CR/DT/R     L14   MODCHK    P5    LAD16 
B11   SPEC4       F12   AD6         M1    HLDA      P6    LAD14 
B12   RPYDEF      F13   AD9         M2    LAD25     P7    LAD11 
B13   Vss         F14   ARB1        M3    BADAC     P8    LAD9 
B14   Vss         G1    W/R         M4    Vss       P9    LAD7 
C1    SSBUSY      G2    BE0         M5    Vcc       P10   LAD5 
C2    WD1/ECC     G3    LOCK        M6    LAD20     P11   LAD4 
C3    LERL0       G12   AD11        M7    LAD13     P12   LAD1 
C4    Vcc         G13   AD12        M8    LAD8      P13   Vss 
C5    Vss         G14   AD2         M9    LAD3      P14   Vcc 

Table 13. M82965 Pinout - In Signal Order 


Signal   PGA Pin   Signal     PGA Pin   Signal   PGA Pin   Signal    PGA Pin 
AD0      D13       ALE        L1        LAD8     M8        SPEC0     A11 
AD1      C14       ARB0       D12       LAD9     P8        SPEC1     B9 
AD2      G14       ARB1       F14       LAD10    N8        SPEC2     C9 
AD3      E12       ARB2       C13       LAD11    P7        SPEC3     A12 
AD4      H14       ARB3       E14       LAD12    N7        SPEC4     B11 
AD5      D14       BADAC      M3        LAD13    M7        SPEC5     A13 
AD6      F12       BE0        G2        LAD14    P6        SSBUSY    C1 
AD7      E13       BE1        H3        LAD15    N6        Vcc       A14 
AD8      J14       BE2        F2        LAD16    P5        Vcc       C4 
AD9      F13       BE3        F1        LAD17    P4        Vcc       C10 
AD10     K14       BERL0      M13       LAD18    N5        Vcc       M5 
AD11     G12       BERL1      K12       LAD19    P3        Vcc       M11 
AD12     G13       BOUT       J12       LAD20    M6        Vcc       P1 
AD13     H12       CACHE      J3        LAD21    N4        Vcc       P14 
AD14     K13       CHK0       L13       LAD22    N3        Vref      M12 
AD15     H13       CHK1       M14       LAD23    N1        Vss       A2 
AD16     A4        CLK2       N11       LAD24    N2        Vss       B1 
AD17     B4        COM        J13       LAD25    M2        Vss       B13 
AD18     B5        CR/DT/R    F3        LAD26    K2        Vss       B14 
AD19     B6        CW/DEN     E1        LAD27    K3        Vss       C5 
AD20     A5        HLDA       M1        LAD28    K1        Vss       C11 
AD21     A6        HOLD       L3        LAD29    J1        Vss       D3 
AD22     C6        IAC0/ERR   B2        LAD30    H1        Vss       L12 
AD23     B7        IAC1/F^    B3        LAD31    J2        Vss       M4 
AD24     C7        INITID     C12       LERL0    C3        Vss       M10 
AD25     A7        LAD0       N12       LERL1    A1        Vss       N14 
AD26     A9        LAD1       P12       LOCK     G3        Vss       P2 
AD27     B8        LAD2       N10       MODCHK   L14       Vss       P13 
AD28     A10       LAD3       M9        PFETCH   D2        WD0/UNC   D1 
AD29     C8        LAD4       P11       POPQUE   A3        WD1/ECC   C2 
AD30     A8        LAD5       P10       READY    H2        W/R       G1 
AD31     B10       LAD6       N9        RESET    N13       WY0/COR   E2 
ADS      L2        LAD7       P9        RPYDEF   B12       WY1/MEM   E3 

Package Dimensions and Mounting 

The MG82965 BXU is packaged in either a 132-lead 
ceramic pin-grid array (PGA) or a 164-pin CQP package. 
(Contact factory for details on CQP availability.) 
Pins in the PGA package are arranged 0.100 inch 
(2.54 mm) center-to-center, in a 14 by 14 matrix, 
three rows around. See Figure 19. 

A wide variety of available sockets allows low-insertion 
or zero-insertion force mountings, and a choice 
of terminals such as soldertail, surface mount, or 
wire-wrap. Figure 20 shows several applicable sockets. 


Package Thermal Specification 

The M82965 BXU is specified for operation when its 
case temperature is within the range of -55°C to 
+125°C. The PGA case temperature should be 
measured at the center of the top surface opposite 
the pins as shown in Figure 21. 

Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the MG82965 BXU 




Amp LIF Socket 55274-1 

• Low insertion force (LIF) soldertail: 55274-1 

• Amp tests indicate 50% reduction in insertion 
force compared to machined sockets 

Other socket options 

• Zero insertion force (ZIF) soldertail: 55583-1 

• Zero insertion force (ZIF) burn-in version: 55573-2 

Cam handle locks in low profile position when MG82965 is installed 
(handle UP for open and DOWN for closed positions). 

Amp Incorporated 
(Harrisburg, PA 17105 U.S.A. 
Phone 717-564-0100) 

Courtesy Amp Incorporated 


Peel-A-Way* and Kapton Socket Terminal Carriers 

• Low insertion force surface mount: CS132-37TG 

• Low insertion force soldertail: CS132-01TG 

• Low insertion force wire-wrap: 
CS132-02TG (two-level) 
CS132-03TG (three-level) 

• Low insertion force press-fit: CS132-05TG 

Peel-A-Way Carrier No. 132: 
Kapton Carrier is KS132 
Mylar Carrier is MS132 
Molded Plastic Body KS132 is shown in Figure 20. 

Advanced Interconnections 
(5 Division Street 
Warwick, RI 02818 U.S.A. 
Phone 401-885-0485) 

Courtesy Advanced Interconnections 
(Peel-A-Way Terminal Carriers 
U.S. Patent No. 4442938) 


*Peel-A-Way is a trademark of Advanced Interconnections. 


Figure 20. Several Socket Options for Mounting the M82965 BXU 

ELECTRICAL SPECIFICATIONS 


Power and Grounding 

The M82965 is implemented in CHMOS III technology 
and has modest power requirements. Its high 
clock frequency and numerous output buffers 
(address/data, control, error, and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean 
on-chip power distribution at high frequency, seven Vcc 
and thirteen Vss pins separately feed functional 
units of the M82965. 

Power and ground connections must be made to all 
Vcc and Vss pins of the M82965. On the circuit 
board, all Vcc pins must be strapped closely together, 
preferably on a Vcc plane. Likewise, all Vss pins 
should be strapped together, preferably on a ground 
plane. 


Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the M82965. The BXU, when driving its two 
32-bit address/data buses (AP-Bus and L-Bus), can 
cause transient power surges, particularly when 
driving large capacitive loads. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical 
performance. Inductance can be reduced by shortening 
the board traces between the BXU and decoupling 
capacitors as much as possible. 


Connection Recommendations 

For reliable operation, always connect unused 
inputs to an appropriate signal level. In particular, if 
PFETCH or LERL1-LERL0 are not used, they should be 
pulled up, and if the CACHE input is not used (i.e., 
BXU operating in the Memory mode) it should be 
tied low through a 10 kΩ resistor. No inputs should 
ever be left floating. 

All open-drain outputs require a pullup device. While 
in most cases a simple pullup resistor will be adequate, 
a network of pullup and pulldown resistors 
biased to a valid VIH (e.g., 3.5V) will limit noise and 
AC power consumption, especially on the AP-Bus. 

ABSOLUTE MAXIMUM RATINGS* 

Case Temperature under Bias    -55°C to +125°C 

Storage Temperature            -65°C to +150°C 

Voltage on Any Pin             -0.5V to Vcc + 0.5V 

Power Dissipation              2.5W 


NOTICE: This data sheet contains information on 
products in the sampling and initial production phases 
of development. The specifications are subject to 
change without notice. Verify with your local Intel 
Sales office that you have the latest data sheet 
before finalizing a design. 

*WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings” may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and 
extended exposure beyond the “Operating Conditions” 
may affect device reliability. 


Operating Conditions 


Symbol   Description                      Min     Max     Units 
Tc       Case Temperature (Instant On)    -55     +125    °C 
Vcc      Digital Supply Voltage           4.75    5.25    V 


Table 14. D.C. Characteristics (Over Specified Operating Conditions) 


Symbol   Parameter                            Min        Max         Units   Comments 
VIL      Input Low Voltage                    -0.3       +0.8        V 
VILA     Input Low Voltage: AP-Bus            -0.5       +1.0        V 
VIH      Input High Voltage                   2.0        Vcc + 0.3   V 
VIHA     Input High Voltage: AP-Bus           2.0        Vcc         V 
Vref/c   Vref Trip Point Cold Start                                  V 
Vref/w   Vref Trip Point Warm Start           1.7        1.8         V 
VCL      CLK2 Input Low Voltage               -0.3       +1.0        V 
VCH      CLK2 Input High Voltage              0.55 Vcc   Vcc         V 
VOL      Output Low Voltage: 
           IOL = 4 mA: LAD Lines                         0.45        V 
           IOL = 5 mA: Controls(2)                       0.45        V 
           IOL = 25 mA: L-Bus 
             Open-Drain Outputs                          0.45        V 
           IOL = 80 mA: AP-Bus 
             Open-Drain Outputs                          0.70        V 
VOH      Output High Voltage: 
           IOH = 1 mA: LAD Lines              2.4                    V 
           IOH = 0.9 mA: Controls(2)          2.4                    V 
           IOH = 5.0 mA: ALE                  2.4                    V 
ICC      Power Supply Current                            450 
ILI      Input Leakage Current                           ±15                 0V ≤ VO ≤ Vcc 
ILO      Output Leakage Current                          ±15 
CIN      Input Capacitance                               10          pF      Note 1 
CO       I/O or Output Capacitance                       12          pF      Note 1 
CCLK     Clock Capacitance                               12          pF      Note 1 


NOTES: 

1. Test frequency = 1 MHz, Tc = 25°C, unmeasured pins at GND. 

2. “Controls” include all L-Bus I/O pins not otherwise specified. 

A.C. SPECIFICATIONS 

This section describes the A.C. specifications for the 
M82965 pins. All input and output timings are 
specified relative to the 1.5V level of the rising edge of 
CLK2, and refer to the time at which the signal 
reaches (for output delay and input setup) or leaves 
(for hold time) the TTL levels of LOW (0.8V) or HIGH 
(2.0V). 

All A.C. testing should be done with input voltages of 
0.45V and 2.4V. 

Maximum output hold times are the same as 
minimum output delays. Tri-state signals have no 
resistive load or termination. 

The Output Delay specified for open-drain signals 
includes both the low to high and high to low 
transitions. The float delay is the amount of time that the 
pulldown transistor may remain active. This 
specification is provided to help system designers 
calculate propagation delay for terminations other than 
the one used for testing. 

Table 15. M82965 A.C. Timing Specifications (Over Specified Operating Conditions) 


Symbol   Parameter                          Min     Max   Units   Comments 
T1       Clock Period                       31.25   125   ns      VIN = 1.5V 
T2       Clock Low Time                     11            ns      VIN = 10% Point = 1.2V 
T3       Clock High Time                    11            ns      VIN = 90% Point = 0.1V + 0.5 Vcc 
T4       Clock Fall Time                            10    ns      VIN = 90% Point to 10% Point 
T5       Clock Rise Time                            10    ns      VIN = 10% Point to 90% Point 
T6       Output Valid Delay: 
           LAD                              4       35    ns      CL = 100 pF 
           WY                               4       35    ns      CL = 125 pF 
           CW, WD, SSBUSY                   4       30    ns      CL = 75 pF 
           CR                               4       45    ns      CL = 75 pF 
           Controls(1)                      2       35    ns      CL = 75 pF 
T7       ALE Width                          15            ns      CL = 75 pF 
T8       ALE Invalid Delay                          20    ns      CL = 75 pF 
T9       Output Float Delay: 
           LAD                              5       20    ns      CL = 100 pF 
           WY                               5       22    ns      CL = 125 pF 
           Controls(1)                      5       22    ns      CL = 75 pF 
T10      Input Setup Time: 
           LOCK, HOLD, HOLDAR, READY        8             ns      10% Point 
           ECC, UNC                         15            ns      10% Point 
           Controls(1)                      3             ns      10% Point 
T11      Input Data Hold                    10            ns      90% Point 
T12      Setup to ALE Inactive              10            ns      CL = 100 pF (LAD), CL = 75 pF (Controls) 
T13      Hold after ALE Inactive            8             ns      CL = 100 pF (LAD), CL = 75 pF (Controls) 
T14      RESET Hold                         5             ns 
T15      RESET Setup                        8             ns 
T16      RESET Width                        1250          ns      40 CLK2 Periods Minimum 
T17      Clock to Data Valid (AP-Bus)               17    ns      CL = 50 pF, IOL = 50 mA 
T18      Clock to High Impedance (AP-Bus)           14    ns 
T19      Output Hold (AP-Bus)               5             ns      CL = 50 pF, IOL = 50 mA 
T20      Input Setup (AP-Bus)               7             ns 
T21      Input Hold (AP-Bus)                10            ns 



NOTE: 

1. “Controls” include all L-Bus I/O pins not otherwise specified. 
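
A few of the Table 15 figures can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the bus cycle is taken as two CLK2 periods (CLK2 is described as twice the frequency of the bus cycle), and the RESET requirement is interpreted as the larger of the 1250 ns figure and 40 CLK2 periods, which is an assumption about how the two limits combine at slower clocks.

```python
T1_MIN_NS = 31.25            # T1, minimum CLK2 period (Table 15)
T1_MAX_NS = 125.0            # T1, maximum CLK2 period (Table 15)
T16_MIN_NS = 1250.0          # T16, minimum RESET width (Table 15)
RESET_MIN_CLK2_PERIODS = 40  # "40 CLK2 Periods Minimum" (Table 15 comment)


def clk2_frequency_mhz(t1_ns: float) -> float:
    """CLK2 frequency in MHz for a given CLK2 period in ns."""
    return 1000.0 / t1_ns


def bus_cycle_ns(t1_ns: float) -> float:
    """One bus cycle spans two CLK2 periods."""
    return 2.0 * t1_ns


def min_reset_width_ns(t1_ns: float) -> float:
    """Assumed rule: honor both the absolute and the period-count limit."""
    return max(T16_MIN_NS, RESET_MIN_CLK2_PERIODS * t1_ns)


if __name__ == "__main__":
    for t1 in (T1_MIN_NS, T1_MAX_NS):
        print(t1, clk2_frequency_mhz(t1), bus_cycle_ns(t1), min_reset_width_ns(t1))
```

At the minimum clock period of 31.25 ns, 40 CLK2 periods equal exactly 1250 ns, which matches the T16 entry.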

Figure 22. CLK2 Timing 


[Timing diagram. CLK2 edges A, B, C, D. Signals shown: READY, CR, BADAC, IAC0, 
CW, WY1, WY0, WD1, WD0. Inputs: LAD31-LAD0, CACHE, W/R.] 


*NOTE: 

LERL signals must be asserted at both edges A2 and A3 in order for them to be recognized by the BXU. 


Figure 23. Drive Levels and Measurement Points for A.C. Specifications. 
L-Bus Timings for the BXU as a Bus Slave 

Figure 24. Drive Levels and Measurement Points for A.C. Specifications. 
L-Bus Timings for the BXU as a Bus Master 

Figure 25. Relative Timing for L-Bus Signals 

[Timing diagram. One bus cycle consists of Phase 1 and Phase 2. Inputs 
AD(31-0), SPEC(5-0) and RPYDEF are valid for T20 before and T21 after each 
sample point.] 


*NOTE: 

BERL signals must be asserted at both edges A2 and A3 in order for them to be recognized by the BXU. 


Figure 26. Relative Timing for AP-Bus Signals 



[Timing diagram. RESET setup and hold relative to a CLK2 edge; T16 = RESET width.] 


Figure 27. RESET Setup and HOLD Timing 

L-BUS DESIGN CONSIDERATIONS 

Input hold times can be disregarded by the designer 
whenever the input is removed because of a 
subsequent output from the BXU (e.g., DEN becomes 
deasserted). In other words, whenever the BXU 
generates an output that indicates a transition into a 
subsequent state, the BXU will have sampled any 
inputs for the previous state. 

As an example, in the recovery (Tr) cycle following a 
read, the minimum time (T6 Min) that DEN becomes 
asserted is specified to be less than the minimum 
hold time on the data (T11 Min). When DEN is 
asserted, however, the data is guaranteed to have been 
sampled. 

Similarly, whenever the BXU generates an output 
that indicates a transition to a subsequent state, any 
outputs that are specified to be tri-stated in this new 
state will be tri-stated. 

For example, in the data (Td) cycle following an 
address (Ta) cycle for a read, the minimum output 
delay (T6 Min) of DEN is specified to be less than the 
maximum float time of LAD (T9 Max). When DEN is 
asserted, however, the LAD outputs are guaranteed 
to have been tri-stated. 


AP-BUS SIGNAL TIMING CONSIDERATIONS 

The AP-Bus uses three-quarter cycle signaling for 
data transmission. Data is driven on edge D and 
sampled on edge C. This approach allows 
three-quarters of the bus cycle to be used for data 
transmission. 

The remaining (one-quarter) time allows for clock 
skew and signal hold time. All AP-Bus signals except 
for the ARB, CHK, and BERL signals use this timing. 
The relationship of the AP-Bus signals is shown in 
Figure 28. 
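
As a rough illustration of the three-quarter cycle split, the sketch below divides one bus cycle into the transmission window and the remaining quarter. It assumes a bus cycle of two CLK2 periods and uses the 31.25 ns minimum clock period from Table 15; the function name is hypothetical, and the result is an idealized budget that ignores rise/fall times and output delays.

```python
def ap_bus_windows_ns(t1_ns: float) -> tuple[float, float, float]:
    """Return (bus cycle, three-quarter window, one-quarter margin) in ns."""
    bus_cycle = 2.0 * t1_ns          # two CLK2 periods per bus cycle
    transmit = 0.75 * bus_cycle      # edge D to the following edge C
    margin = bus_cycle - transmit    # left for clock skew and hold time
    return bus_cycle, transmit, margin


if __name__ == "__main__":
    print(ap_bus_windows_ns(31.25))
```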


The CHK signals (interlaced parity) are delayed by 
one-half cycle or one phase to allow for generation 
of parity from the internal data that is being 
transmitted. The CHK lines are sampled one phase after the 
data has been sampled and compared against the 
parity generated for the received data. 

Most input signals on the AP-Bus are sampled on 
the rising edge of CLK2 at edge C. The exceptions 
are the error signals CHK, BERL and ARB, which 
are sampled on the rising edge of CLK2 at edge A. 
Regardless of the edge, the setup and hold times 
are the same. 


All outputs on the AP-Bus are driven relative to the 
falling edge of CLK2 at the middle of phase 2, 
except CHK, BERL and ARB, which transition on the 
falling edge of CLK2 at the middle of phase 1. 


When designing a system based on the AP-Bus, the 
system topology will be limited by the available 
propagation time for signals in the system. The 
propagation time must allow for settling of ringing, ground 
shift, and crosstalk, all of which are dependent on 
board and system materials and design. 



The following equation gives the propagation time 
available, given a specific clock implementation and 
frequency: 

TPROP = 2 T1 - (T3 + T4 + T5 + (T17 or T18) + T20 + TSKEW) 


Where TSKEW is the worst case clock skew between 
BXUs (clock skew is the time delay between any two 
clocks in the system due to physical distribution 
limits). 

In AP-Bus systems, this skew is defined as follows: 


TSKEW ≤ T3 + T20 - T11 


L-Bus Waveforms 

Figures 30 through 36 illustrate the relationship of 
L-Bus signals during a variety of bus transactions. For 
a detailed discussion of the operation of the L-Bus, 
consult the 80960MC Hardware Designer’s 
Reference Manual. 

Figure 29. System and Processor Clock Relationship 

Figure 34. Hold Timing 

Figure 36. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 


85C960 

1-MICRON CHMOS 

80960 K-SERIES BUS CONTROL µPLD 


■ Burst Logic, Ready Control, and 
Address Decode Support for 80960 
KA/KB Embedded Controllers in Single 
Chip 

■ Supports Both Conventional 
and New Generation “Burst Mode” 
Memories and Peripherals 

■ Ready/Timing Control Supports 0-15 
Wait States across 8 Address Ranges, 
Read/Write Accesses, Burst 
Transactions 

■ 8 Dedicated Inputs Decoded into 8 
Latched Chip Selects (4 External/ 
Internal; 4 Internal Only) 

*CHMOS is a patented technology of Intel Corporation. 


■ Operates with 80960KA/KB at 20 MHz 
and 25 MHz 

■ Icc = mA Max. 

■ UV Erasable (CerDIP) or OTP™ 

■ 100% Generically Testable Logic Array 

■ Based on Low Power CHMOS IIIE* 
Technology 

■ Available in 28-Pin 300-mil CerDIP and 
PDIP Packages and in 28-Pin PLCC 
Package 

(See Packaging Spec., Order Number #231369) 



Figure 1. Pinout Diagram 

GENERAL DESCRIPTION 

The Intel 85C960 is a single-chip burst/ready/decode 
µPLD (Microcomputer Programmable Logic 
Device) designed to interface 80960 KA/KB 
embedded controllers to system memory and I/O. The 
85C960 provides programmable chip selects, a 
programmable read/write access wait state/ready 
generator, and burst address (A2, A3) cycling. Burst 
transaction cycling of A2, A3, and WCLK# (Write 
Clock) is also supported for intelligent peripherals on 
the bus. 

For its programmable functions, the 85C960 uses 
advanced EPROM cells as logic array and wait-state 
table memory elements. Coupled with Intel’s 
proprietary CHMOS IIIE technology, the result is a 
programmable device able to support Intel’s 32-bit 
80960 KA/KB embedded controllers at speeds up to 
25 MHz. 


ARCHITECTURE DESCRIPTION 

The 85C960 µPLD integrates burst control, ready 
generation, and chip select decoding into a single 
device. Figure 2 shows the architecture of the 
85C960. Table 1 lists and describes each signal on 
the device. The 85C960 replaces 6-10 separate 
PLD/discrete logic devices in small- and 
medium-sized 80960 systems. For medium- to large-sized 
systems, the 85C960 can be supplemented with an 
additional decoder, such as the 85C508, and a 
second 85C960. Figure 3 shows a single 85C960 in a 
typical application. 

Figure 3. 85C960 in an 80960 System 

Table 1. 85C960 Pin Descriptions 


Symbol 

Type 

Name and Function 

RESET 

I 

RESET. When RESET is high for a minimum of four CLK2 cycles, internal 
circuits are reset to a known state. 

I7-I0 

I 

INPUT 7- INPUT 0. These are the address range inputs to the 
programmable decode logic array. 

CLK2 

I 

SYSTEM CLOCK. This input, which connects to the 80960 CLK2 signal, 
provides the timing reference for all 85C960 operations. 

AD3-AD0 

I 

ADDRESS IN 3-ADDRESS IN 0. These inputs are driven by LAD3-LAD0 
from the Local Bus (L-Bus) to provide addressing and burst access decode 
information. 

W/R# 

I 

WRITE/READ. Write/Read from controller. When low, indicates that the 
current access is a read. When high, indicates that the current access is a 
write. 

DEN# 

I 

DATA ENABLE. This input from the controller indicates that data is present 
on the L-Bus. 

ADS# 

I 

ADDRESS/DATA STROBE. This input from the 80960 indicates whether 
address or data information is currently on the L-Bus. When low, address 
information is changing. The 85C960 chip select timing is based in part on 
ADS# low during Ta states. 

BLAST# 

O 

BURST LAST. This signal, when low, indicates that the current read/write 
access is the last access in a burst transaction. BLAST# is not cycled if 
RDY# is generated off-chip. 

WCLK# 

O 

WRITE CLOCK. This output provides a write enable strobe to memories that 
do not support burst mode access. 

A3, A2 

O 

ADDRESS OUT 3, 2. These outputs cycle during burst transactions. 
Typically connected to lowest memory address signals. 

CS3#-CS0# 

O 

CHIP SELECT 3-CHIP SELECT 0. Single p-term select outputs that are 
driven active (low) for the programmed address condition on I7-I0. 

RDY# 

I/O 

READY. RDY# is an active low, bidirectional, open-drain signal that should 
be connected to the controller’s Ready input. As an output, RDY# goes high 
to cause the controller to extend the current access. RDY# goes low to 
indicate that the data on the L-Bus may be sampled (read) or removed 
(write). RDY# is controlled by the 85C960 Ready Generation and Wait-State 
Logic. The open-drain output allows RDY# to be OR-tied to other circuitry 
that may drive the controller’s Ready input. As a bidirectional input, RDY# 
allows the 85C960 to provide Ready timing and burst cycling for intelligent 
peripherals that do not generate these signals themselves. 

80960 L-Bus (Local Bus) cycles are monitored by 
the Bus State Tracker to synchronize the functional 
blocks in the 85C960 to the L-Bus. CLK2 provides 
the timing reference for all 85C960 operations. 

Four external chip selects (CS0#-CS3#) are 
generated by the programmable Chip Select Decoder. 
These four signals provide decoded selects to 
memory and I/O devices and are routed to the 
programmable Wait-State Table so that the 85C960 can 
generate RDY# at the appropriate time. Four 
additional selects are decoded (internal only) and routed 
to the Wait-State Table so that the 85C960 can 
generate RDY# for up to four additional address 
ranges. 

The Ready Generation block generates RDY# to 
the controller under control of the Wait-State Table. 
Depending on the contents programmed into this 
table and the current type of access, from 0-15 wait 
states can be introduced into each bus cycle. An 
independent wait state value can be chosen for 
each select and each access type. Four access 
types are possible: read first, read subsequent, write 
first, and write subsequent. 


The true and complements of all inputs (I7-I0) are available
to all eight NAND p-terms.

Each intersecting point in the logic array is connected or not
connected based on the value programmed in the EPROM array. Initially
(EPROM erased state), no connections exist between any p-term and any
input. Connections can be made by programming the appropriate EPROM
cells. Since p-terms are implemented as NANDs, a true condition on a
p-term drives the output low. Current consumption is higher when both
true and complement p-terms for the same input are programmed.

Selects are latched on the falling edge of an internal Latch Enable
(LE), which is generated from ADS#, DEN#, and CLK2. The proper
combination of these signals occurs during an 80960 address state
(Ta). Figure 5 shows the relationship of the internal LE and external
chip selects to the three signals at the end of a Ta state. All
selects are cleared to an inactive high state at the start of a
recovery state (Tr). All eight selects (four external and four
internal) are routed to the Wait-State Table.


The Burst Control and Address Counter blocks control burst transaction
timing to memory and I/O. Note that the RDY# pin is sampled by the
Burst Control block to allow the 85C960 to generate burst transaction
timing for other bus peripherals. WCLK# provides a write enable strobe
for memory and I/O that do not support burst mode. BLAST# informs
burst-mode devices that the current access is the last one in a burst
transaction. A2 and A3 are cycled to select the address location for
each access.


FUNCTIONAL DESCRIPTION 

The following paragraphs provide a detailed description of each
functional block in the 85C960 µPLD.


Chip Select Decoder 

The Chip Select Decoder, shown in Figure 4, is a high speed, single
p-term (product-term) latched decoder circuit with eight inputs
(I0-I7) and eight latched outputs. Each output goes low when its
associated product term is true. Four of these outputs (CS0#-CS3#) are
available externally to be used as device selects. The remaining four
outputs (CS4#-CS7#) are available internally so that the 85C960 can
provide ready and burst timing for four more device selects. (The
actual selects for these four additional devices/resources must be
generated by external logic.)


Wait State Table 

Chip selects, WR (Write/Read), and SW (Subsequent Word) feed the
Wait-State Table. Each chip select points to a set of four wait-state
values, while WR and SW determine which of the four values to route to
the Ready Generation block (see Figure 6). The four values are grouped
into read and write groups, with each group having one value for the
first access and one for subsequent accesses (second through fourth).
The four-bit wait-state value is sent to the Ready Generation block
(via WS0#-WS3#) to be used as an initial count value. If two selects
are active, the resulting count value is the bitwise logical AND of
the two individual values. If more than two selects are active and the
individual count values are not the same, the resulting count value is
indeterminate. If no select is active, no count value is loaded (and
the Ready Generation circuit is disabled).
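
The lookup rules above can be sketched in a few lines of Python. This is only an illustrative model, not Intel software: the function name, the dictionary layout, and the entries for select 1 are assumptions made for the example. The entries for select 0 follow the example values in Figure 6.

```python
# Hypothetical model of the Wait-State Table lookup (names are illustrative).
# table[select][(wr, sw)] holds a 4-bit wait-state count.
EXAMPLE_TABLE = {
    # Select 0: values taken from the Figure 6 example.
    0: {(0, 0): 0b0000, (1, 0): 0b0000, (0, 1): 0b0011, (1, 1): 0b0010},
    # Select 1: made-up values, used only to show the AND behavior.
    1: {(0, 0): 0b0001, (1, 0): 0b0001, (0, 1): 0b0101, (1, 1): 0b0100},
}


def wait_state_count(table, active_selects, wr, sw):
    """Return the count loaded into Ready Generation, or None if disabled.

    wr: 0 = read, 1 = write.  sw: 0 = first word, 1 = subsequent word.
    """
    if not active_selects:
        return None  # no select active: no count loaded, RDY# logic disabled
    values = [table[s][(wr, sw)] for s in active_selects]
    if len(values) > 2 and len(set(values)) > 1:
        # More than two selects with differing values: indeterminate.
        raise ValueError("indeterminate wait-state count")
    count = 0b1111
    for value in values:
        count &= value  # bitwise AND of the individual values
    return count
```

For a subsequent-word read with only select 0 active, the sketch returns 3 (binary 0011), matching the Figure 6 entry.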



Ready Generation 

RDY# is high at the start of each burst transaction. The RDY Generator
begins to count down from the wait state value, decrementing the
counter at the start of each wait state. When the internal counter
reaches 0000, RDY# is pulled low (CLK2c during the data state). On the
next CLK2c edge (for a wait state), RDY# is released, allowing an
external resistor to pull RDY# high. Figure 7 shows the timing for a
four-word burst write transaction with 1 wait state for the first
access and 0 wait states for the remaining three accesses (Burst Write
1-0-0-0).
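
As a rough illustration of the countdown, the sketch below lists the RDY# level for each wait or data state of a burst. It is a deliberately coarse model and an assumption of this example: it ignores the CLK2a-CLK2d edge detail, the Ta and Tr states, and pull-up rise time, and the function name is hypothetical.

```python
def rdy_levels(wait_states):
    """Return RDY# per state for a burst: 1 = high (wait), 0 = low (ready).

    wait_states: wait-state count for each access, e.g. [1, 0, 0, 0].
    """
    levels = []
    for count in wait_states:
        levels.extend([1] * count)  # RDY# stays high while counting down
        levels.append(0)            # counter at 0000: RDY# low for the data state
    return levels


# Burst Write 1-0-0-0: one wait state, then four data states.
print(rdy_levels([1, 0, 0, 0]))
```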


The input to each latch is a single NAND p-term that
can be connected to the dedicated inputs.
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CLK2a edge of a Tr state. If a Read or Write access 
occurs without a chip select having been decoded 
on-chip, the RDY# output buffer is disabled and 
RDY# is sampled as an input. This allows the 
85C960 to cycle A2, A3, and WCLK# to provide 
burst transaction timing for other bus controllers. 
RDY# may be OR-tied with other bus controllers so 
they can access the processor Ready signal. 



Figure 4. 85C960 Chip Select Decoder Block 


RDY# is an open-drain I/O pin, which must be connected to pullup and
pulldown resistors as shown in Figure 8. During a wait-state access,
RDY# is pulled high to cause the controller to extend the current
access so that the memory or peripheral chip has time to present data
to the bus (read), or sample data on the bus (write). RDY# is released
on the
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Latch opens when CLK2 and DEN# go high and ADS# goes low. 

Latch closes when DEN# goes low or ADS# or CLK2 go high. 


Figure 5. Internal LE and External Chip Select Timing 


Burst Transactions 

AD3, AD2 are latched to indicate the starting address of a burst
transaction. The 85C960 places these two signals out on A3 and A2,
respectively, then cycles the two addresses upward until the last
access of the burst. The 85C960 assumes that the processor handles
splitting of the burst transaction when a 16-byte boundary is crossed.

AD0 and AD1 specify the size of the burst transfer in double-words as
shown in Table 2.


Table 2. AD0-AD1 vs Burst Size

AD1   AD0   No. of Words Transferred
 0     0    1
 0     1    2
 1     0    3
 1     1    4
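
The decoding described above and in Table 2 can be sketched as follows. The function name and the packing of AD3-AD0 into one integer are assumptions of this example. Because the text says the processor splits bursts at 16-byte boundaries, the sketch treats a burst that would run past A3:A2 = 11 as an error rather than wrapping.

```python
def burst_addresses(ad):
    """Return the A3:A2 values (0-3) driven for each access of a burst.

    ad: the latched low address/data bits, packed as AD3 AD2 AD1 AD0.
    """
    words = (ad & 0b11) + 1      # AD1:AD0 encode 1 to 4 words (Table 2)
    start = (ad >> 2) & 0b11     # AD3:AD2 give the starting word
    if start + words > 4:
        # Assumption: the processor has already split such a burst.
        raise ValueError("burst would cross a 16-byte boundary")
    return [start + i for i in range(words)]


print(burst_addresses(0b0011))  # four words starting at word 0
```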


WCLK#, BLAST# Generation

WCLK# is the write enable signal for writing to non- 
burst mode memories. When low, address outputs 
A2 and A3 are valid. Its trailing edge (low-to-high 
transition) can be used to latch data into non-burst 
mode memories. WCLK# is only provided during 
writes; during reads, WCLK# remains high. 



BLAST# indicates that the current access is the last access in a burst
transaction. BLAST# is used by burst-mode memories to reset internal
address counters. BLAST# is not cycled when RDY# is generated
off-chip.


POWER-ON CHARACTERISTICS 

85C960 inputs and outputs begin responding 1 µs (max.) after Vcc
power-up (Vcc = 4.75V) or after a power-loss/power-up sequence. RESET
must be synchronous to CLK2 and must be held high for a minimum of 4
clock cycles after Vcc reaches 4.75V. After 4 clock cycles, A2 and A3
are high, CS0#-CS3# (and CS4#-CS7#), BLAST#, and WCLK# are high, and
the open-drain RDY# signal is inactive.
Select   Subsequent Word            Write/Read
                                    WR = 0 (Read)   WR = 1 (Write)
                                    msb  lsb        msb  lsb
CS0#     SW = 0 (First Word)           0000            0000
         SW = 1 (Subsequent Word)      0011            0010

msb = most significant bit
lsb = least significant bit


Figure 6. Example Wait-State Entries for CS0#


ERASURE CHARACTERISTICS


Erasure time for the 85C960 is 20 minutes at
12,000 µW/cm2 with a 2537Å UV lamp.


Erasure characteristics of the device are such that erasure begins to
occur upon exposure to light with wavelengths shorter than
approximately 4000Å. It should be noted that sunlight and certain
types of fluorescent lamps have wavelengths in the 3000Å-4000Å range.
Data shows that constant exposure to room level fluorescent lighting
could erase the typical 85C960 in approximately two years, while it
would take approximately two weeks to erase the device when exposed to
direct sunlight. If the device is to be exposed to these lighting
conditions for extended periods of time, conductive opaque labels
should be placed over the device window to prevent unintentional
erasure.

The recommended erasure procedure for the 85C960 is exposure to
shortwave ultraviolet light with a wavelength of 2537Å. The integrated
dose (i.e., UV intensity x exposure time) for erasure should be a
minimum of fifteen (15) Wsec/cm2. The erasure time with this dosage is
approximately 20 minutes using an ultraviolet lamp with a 12,000
µW/cm2 power rating. The device should be placed within 1 inch of the
lamp tubes during exposure. The maximum integrated dose the 85C960 can
be exposed to without damage is 7258 Wsec/cm2 (1 week at 12,000
µW/cm2). Exposure to high intensity UV light for longer periods may
cause permanent damage to the device.
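
The dose figures above follow from dose = intensity x time. The short sketch below reproduces that arithmetic; the function names are illustrative and not part of any Intel tool.

```python
def exposure_seconds(dose_wsec_per_cm2, lamp_uw_per_cm2):
    """Time needed to reach a dose, given lamp intensity in microwatts/cm2."""
    return dose_wsec_per_cm2 / (lamp_uw_per_cm2 * 1e-6)


def dose_wsec_per_cm2(seconds, lamp_uw_per_cm2):
    """Integrated dose accumulated over an exposure time."""
    return seconds * lamp_uw_per_cm2 * 1e-6


# Minimum erase dose of 15 Wsec/cm2 with a 12,000 uW/cm2 lamp:
minutes = exposure_seconds(15, 12_000) / 60       # about 20.8 minutes
# One week under the same lamp, the stated damage limit:
week_dose = dose_wsec_per_cm2(7 * 24 * 3600, 12_000)  # about 7258 Wsec/cm2
print(round(minutes, 1), round(week_dose))
```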


LATCH-UP IMMUNITY 

All of the input, output, and clock pins of the device have been
designed to resist latch-up, which is inherent in inferior CMOS
processes. The 85C960 is designed with Intel's proprietary 1-micron
CHMOS EPROM process. Thus, each of the pins will not experience
latch-up with currents up to ±100 mA and voltages ranging from -0.5V
to (Vcc + 0.5V). The programming pin is designed to resist latch-up to
the 13.5V max. device limit.


DESIGN RECOMMENDATIONS 

For proper operation, it is recommended that all input and output pins
be constrained to the voltage range GND ≤ (VIN or VOUT) ≤ Vcc. All
unused inputs should be tied high or low to minimize power consumption
(do not leave them floating). Unused outputs may be left floating. A
high-speed ceramic decoupling capacitor of at least 0.2 µF must be
connected directly between the Vcc and GND pin.

As with all CMOS devices, ESD handling procedures 
should be used with the 85C960 to prevent damage 
to the device during programming, assembly, and 
test. 


FUNCTIONAL TESTING 

Since the programmable sections of the 85C960 are controlled by EPROM
elements, the device is completely testable during the manufacturing
process. Each programmable EPROM bit controlling the internal logic is
tested using application independent test patterns. EPROM cells in the
device are 100% tested for programming and erasure. After testing, the
devices are erased before shipment to customers. No post-programming
tests of the EPROM array are required.

The testability and reliability of EPROM-based programmable logic
devices is an important feature over similar devices based on fuse
technology. Fuse-based programmable logic devices require a user to
perform post-programming tests to ensure device functionality. During
the manufacturing process, tests on fuse-based parts can only be
performed in very restricted ways in order to avoid pre-programming
the array.
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Figure 7. Burst Write Transaction (1-0-0-0) 
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lOL = 28.8 mA 


VoH = 3.0V 



Figure 8. RDY# Pullup/Pulldown Resistors 


IN-CIRCUIT RECONFIGURATION 

The 85C960 allows in-circuit configuration changes
after the device has powered up. At power-up, the
device is configured according to the information
programmed into the EPROM cells. After power-up,
new information can be shifted in on select pins to
alter device configuration. The new configuration is
retained until the device is powered down or until the
information is overwritten by another configuration
change.




Note that in-circuit configuration changes allow "on-the-fly" changes
to be made, but do not alter EPROM cell data. At the next power-up,
the device will be configured according to the original data
programmed into the EPROM cells. In-circuit reconfiguration requires
additional circuitry external to the 85C960. For details on in-circuit
configuration changes, refer to AP-337, In-Circuit Reconfiguration of
85C960 and 85C508 µPLDs, order number 292072.


DESIGN SOFTWARE 

Software support is provided by version 2.1 (or later) of iPLS II
(Intel Programmable Logic Software II). Programming is supported on
the iUP-PC PC-based programmer or iUP-200A/201A Universal Programmer
via the GUPI base module and the GUPI 85EPLD28 programming adaptor.

For detailed information on iPLS II, refer to the iPLDS II Data Sheet,
order number 290134. The tools section of the Programmable Logic
handbook contains a complete listing of all design tools for Intel
EPLDs.


ORDERING INFORMATION

80960KA/KB
Clock Frequency   µPLD Order Code   Package   Operating Range
20 MHz            *D85C960-20       CERDIP    Commercial
                  N85C960-20        PLCC
25 MHz            *D85C960-25       CERDIP    Commercial
                  N85C960-25        PLCC

*Only windowed CERDIP allows UV-erase.
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ABSOLUTE MAXIMUM RATINGS^ 

Supply Voltage (Vcc)(1) .................. -2.0V to +7.0V
Programming Supply Voltage (Vpp)(1) ...... -2.0V to +13.5V
D.C. Input Voltage (VI)(1, 2) ............ -0.5V to Vcc + 0.5V
Storage Temperature (Tstg) ............... -65°C to +150°C
Ambient Temperature (TA)(3) .............. -10°C to +85°C

NOTES: 

1. Voltages with respect to GND. 

2. Minimum D.C. input is -0.5V. During transitions, the inputs
may undershoot to -2.0V or overshoot to +7.0V for
periods of less than 20 ns under no load conditions.

3. Under bias. Extended Temperature versions are also 
available. 


NOTICE: This is a production data sheet. The specifi- 
cations are subject to change without notice. 

* WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings” may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and ex- 
tended exposure beyond the “Operating Conditions” 
may affect device reliability. 


RECOMMENDED OPERATING CONDITIONS

Symbol   Parameter               Min    Max    Units
Vcc      Supply Voltage          4.75   5.25   V
VIN      Input Voltage           0      Vcc    V
VO       Output Voltage          0      Vcc    V
TA       Operating Temperature   0      +70    °C
D.C. CHARACTERISTICS (TA = 0°C to +70°C, Vcc = 5.0V ± 5%)

Symbol    Parameter                        Min    Typ   Max         Unit   Test Conditions
VIH1(4)   High Level Input Voltage         2.0          Vcc + 0.3   V
          (all inputs except ADS#,
          AD0-AD3, DEN#, and W/R#)
VIH2(4)   High Level Input Voltage         2.2                      V
          for ADS#, AD0-AD3, DEN#,
          and W/R#
VIL(4)    Low Level Input Voltage          -0.3         0.8         V
VOH       High Level Output Voltage        2.4                      V      IOH = -4.0 mA D.C., Vcc = Min.
VOL1      Low Level Output Voltage                      0.4         V      IOL = 4.0 mA D.C., Vcc = Min.,
                                                                           CL = 30 pF
VOL2      Low Level Output Voltage                      0.45        V      IOL = 24 mA D.C., Vcc = Min.,
          for A2, A3                                                       CL = 60 pF
VOL3      Low Level Output Voltage                      0.5         V      IOL = 30 mA D.C., Vcc = Min.,
          for Open Drain (RDY#)                                            CL = 30 pF
II        Input Leakage Current                         ±10         µA     Vcc = Max., GND < VIN ≤ Vcc
IOZ       Output Leakage Current                        ±10         µA     Vcc = Max., GND < VOUT ≤ Vcc
          Output Short Circuit Current     -30          -90         mA     Vcc = Max., VOUT = 0.5V
Icc       Power Supply Current                    10    50          mA     Vcc = Max., VIN = Vcc or GND,
                                                                           No Load, CLK2 = 50 MHz


NOTES: 

4. Absolute values with respect to device GND; all over and undershoots due to system or tester noise are included. 

5. Not more than 1 output should be tested at a time. Duration of that test should not exceed 1 second. 


A.C. TESTING LOAD CIRCUIT (RDY#) 



See D.C. Characteristics Table for Current and Capaci- 
tance Specifications. 

D1 and D2 are matched. 


A.C. TESTING LOAD CIRCUIT 



290192-18 

See D.C. Characteristics Table for Current and Capaci- 
tance Specifications. 

D1 and D2 are matched 
D3 and D4 are matched 
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A.C. TESTING WAVEFORM— SYNCHRONOUS INPUTS AND OUTPUTS 


CLK2 


INPUT (SETUP 
AND HOLD) 


OUTPUTS 



290192-10 

A.C. Testing: Inputs are driven at 2.4V for a Logic “1” and 0.4V for a Logic “0”. CLK2 is driven at 3.0V for a Logic “1” 
and 0.45V for a Logic “0”. Timing Measurements made relative to CLK2 are made from 1.5V on CLK2. Inputs and 
outputs are measured at 2.0V for a high and 0.8V for a low. Device input rise and fall times are less than 3 ns. 


A.C. TESTING WAVEFORM— ASYNCHRONOUS INPUTS AND OUTPUTS 




Waveform: INPUTS and OUTPUTS, each with marked test points.

A.C. Testing: Inputs are driven at 2.4V for a Logic "1" and 0.4V for a Logic "0". Input timing is measured at 1.5V for
high-to-low and low-to-high transitions. Outputs are measured at 2.0V for a high and 0.8V for a low. Device input rise and
fall times are less than 3 ns.
A.C. CHARACTERISTICS (TA = 0°C to +70°C, Vcc = 5.0V ± 5%)

                                                          85C960-25     85C960-20
Symbol    Parameter                                       Min    Max    Min    Max    Units
t1(6)     Input Setup to CLK2a                            12            15            ns
t2(6)     Input Hold from CLK2a                           2             2             ns
t3        CLK2a to A2, A3 Valid Delay                     0      8      0      10     ns
t4        CLK2c to RDY# Output Low Delay                         10            15     ns
t5(7)     CLK2c to RDY# Output High Delay                        10            15     ns
t6        CLK2a to CS0#-CS3# High Delay                   5      40     5      50     ns
t7        CLK2a to BLAST# Low Delay                              20            20     ns
t8        CLK2a to BLAST# High Delay                      5             5             ns
t9(8)     CLK2b to WCLK# Low Delay                        0      10     0      12     ns
t10(8)    CLK2d to WCLK# High Delay                       0      10     0      12     ns
t11(9)    ADS# Low to CS0#-CS3# Low Delay                        10            12     ns
t12(9)    CLK2c to CS0#-CS3# Low Delay                           12            15     ns
t13(10)   I0-I7 Setup to CLK2a                            5             7             ns
t14(10)   I0-I7 Hold from CLK2a                           2             2             ns
t15(11)   I0-I7 Valid to CS0#-CS3# Valid Delay (tPD)             10            12     ns
t16       RDY# Input Setup to CLK2d (Write)               7.5           10            ns
t17       RDY# Input Setup to CLK2a (Read)                9             9             ns
t18       RDY# Input Hold after CLK2a (Read/Write)        5             10            ns
t19       RESET High Setup to CLK2↑                       0             0             ns
t20       RESET High Hold from CLK2↑                      3             3             ns
t21(12)   RESET Low Setup to CLK2a                        5             5             ns

NOTES: 

6. Applies to ADS#, DEN#, W/R#, and AD0-AD3. DEN# is high during the entire Ta state in 80960 KA/KB systems. 

7. RDY# is an open-drain output. Specified time includes RDY# output float delay and pull-up/pull-down resistors 
(Figure 8). RDY# remains low for a minimum of 10 ns at the start of a Tr state and goes high by CLK2a of the next Tx state. 

8. Minimum WCLK# pulse width is one clock period minus 3 ns. For example, at 25 MHz: 20 ns — 3 ns = a 17 ns minimum 
WCLK# pulse. 

9. Chip Select Decoder latches are transparent flow-through types. Latches open when ADS# is low, DEN# is high, and 
CLK2 goes high during the middle of a Tx state (CLK2c). Since DEN# is high during the entire Ta state in 80960 KA/KB 
systems, only CLK2c and ADS# are specified. 

10. Chip Select Decoder latches are transparent flow-through types. Latches close when ADS# is high or DEN# is low, or 
when CLK2 goes high at the start of a Tx state (CLK2a) after the latches have opened. Since ADS# is low and DEN# is 
high at the end of a Ta in 80960 KA/KB systems, setup and hold times are specified with reference to CLK2a only. 

11. Propagation delay while latches are open (transparent); one output switching (high-to-low).

12. RESET must be held high for a minimum of 4 CLK2 cycles (80960 specifies 41 CLK2 cycles minimum). 

13. RESET must hold after the low-to-high transition immediately prior to CLK2a. CLK2a is defined as the first low-to-high 
transition after RESET goes low. 
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CLK2 EDGES 



CAPACITANCE (TA = 0°C to +70°C; Vcc = 5.0V ± 5%)

Symbol   Parameter             Min   Typ   Max   Unit   Conditions
CIN      Input Capacitance           6     10    pF     VIN = 0V, f = 1.0 MHz
COUT     Output Capacitance          6     10    pF     VOUT = 0V, f = 1.0 MHz
CCLK     CLK2 Capacitance            6     10    pF     VIN = 0V, f = 1.0 MHz
CVPP     Vpp Pin Capacitance         10    25    pF     Vpp on Pin 1 (RESET)
CRDY     RDY# Capacitance            6     10    pF     VOUT = 0V, f = 1.0 MHz
4 Word Burst Write with 1 Wait State on Each Access
RDY# is Generated by the 85C960
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27960CX 

PIPELINED BURST ACCESS 1M (128K x 8) CHMOS EPROM 



■ Synchronous 4 Byte Data Burst Access

■ No Glue Interface to 80960CA

■ High Performance Clock to Data Out
  - Zero Wait State Data to Data Burst
  - Up to 33 MHz 80960CA Performance

■ Asynch Microcontroller Reset Function
  - Returns to Known State with High-Z Outputs

■ Pipelined Addressing for Optimal Bus Bandwidth on 80960CA
  - Next Addressing Overlaps Last Data Byte

■ CHMOS III-E for High Performance and Low Power
  - 125 mA Active, 30 mA Standby
  - TTL Compatible Inputs

■ 1 Mbit Density Configures as 128K x 8


Intel's 27960CX is a 5V only, 1,048,576-bit Erasable Programmable Read Only Memory, organized as 128K
words of 8 bits.


The 27960CX provides a no glue synchronous burst interface to the 80960CA bus. Internally the 27960CX is
organized in 4 byte blocks, in which each byte is accessed sequentially. The internal state machine is factory
configured to generate either 1 or 2 wait-states between the address and first data byte. High performance
outputs provide zero wait-state data to data accesses at clock frequencies up to 33 MHz.

Pipelining capability allows addresses to overlap previous data, further optimizing bus bandwidth in 80960CA
applications. An asynchronous microcontroller RESET feature puts the outputs in the high impedance state
and takes the internal state machine to a known state where a new burst access can begin.

The 27960CX is available in a 44-lead PLCC package, providing optimum cost effectiveness.

The 27960CX is manufactured on Intel's 1 micron CHMOS III-E technology. The Quick-Pulse Programming™
algorithm provides fast, reliable programming with throughput under 17 seconds for optimized equipment.

*CHMOS is a Patented Process of Intel Corporation. 



Figure 1. 27960CX Burst EPROM Block Diagram 
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27960CX BURST EPROM 

EPROMs are established as the preferred code stor- 
age device in embedded applications. The non-vola- 
tile, flexible, reliable, cost effective EPROM makes a 
product easier to design, manufacture and service. 
Until recently, however, EPROMs could not match 
the performance needs of high-end systems. The 
27960CX was designed to support the 80960CA em- 
bedded processor. It utilizes the burst interface to 
offer near zero wait-state performance without the 
high cost normally associated with this performance. 

In embedded designs, board space and cost must be kept to a minimum
without impacting performance and reliability. The 27960CX removes the
need for expensive high-speed shadow RAM backed up by slow EPROM or
ROM for non-volatile code storage. Code optimization concerns are
reduced, with "off-chip" code fetches no longer crippling system
performance. FONTs can be run directly out of these EPROMs at the same
performance as high-speed DRAMs. With the 27960CX, the EPROM is the
ideal code or FONT storage device for your 80960CA system.


■^CERQUAD is available in a socket only version. 


Architecture 

The 27960CX provides a no-glue, synchronous burst interface to the
80960CA's bus. It operates in pipelined or non-pipelined modes.
Internally, the 27960CX is organized in 4 byte blocks which are
accessed sequentially. A burst access begins on the first clock pulse
after ADS and CS are asserted. The address of the 4 byte block is
latched on the rising edge of clock following ADS. After a preset
number of wait-states (1 or 2), data is output one byte at a time on
each subsequent clock cycle. A burst access is terminated on the
rising edge of clock with BLAST asserted. High performance outputs
provide zero wait-state data to data accesses at clock frequencies up
to 33 MHz. Extra power and ground pins dedicated to the outputs reduce
the effects of fast output switching on device performance.

The pipelining capability of the 27960CX allows the address to overlap
the last data byte of the burst, further optimizing bus bandwidth in
80960CA applications. In the pipelined mode, with a non-buffered
interface, the 27960CX delivers 4 bytes of data in 6 clock cycles at
33 MHz. In a 32-bit configuration, this translates into a read
bandwidth of 88 Mbytes/sec. Performance capability of the 27960CX in
different 80960CA systems is given in Table 1.
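
The bandwidth figures quoted here and in Table 1 can be checked with simple arithmetic. The sketch below assumes four byte-wide 27960CX devices side by side for a 32-bit bus; the function name and parameter names are illustrative.

```python
def burst_bandwidth_mbytes(clock_mhz, clocks_per_burst,
                           devices=4, bytes_per_burst=4):
    """Read bandwidth in Mbytes/sec for pipelined back-to-back bursts.

    Each device delivers bytes_per_burst bytes every clocks_per_burst
    clocks; `devices` byte-wide EPROMs operate in parallel.
    """
    return clock_mhz * devices * bytes_per_burst / clocks_per_burst


print(burst_bandwidth_mbytes(33, 6))  # 33 MHz, 2 wait states
print(burst_bandwidth_mbytes(25, 6))  # 25 MHz, 2 wait states
print(burst_bandwidth_mbytes(16, 5))  # 16 MHz, 1 wait state
```

Truncated to whole numbers, the three results correspond to the 88, 66, and 51 Mbytes/sec entries of Table 1.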
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Figure 2. 27960CX Burst EPROM Signal Set 
Table 1. Performance Capability

33 MHz 2 WS Non-Buffered: 4 Words/6 Clock Cycles, 88 Mbytes/Sec

ADDR   A00  WS   WS   -    -    -    A01  WS   WS   -    -    -    A02  WS
DATA   -    -    -    D00  D01  D02  D03  -    -    D10  D11  D12  D13  -
PCLK   C1   C2   C3   C4   C5   C6   C7   C1   C2   C3   C4   C5   C6   C1

25 MHz 2 WS Buffered: 4 Words/6 Clock Cycles, 66 Mbytes/Sec

ADDR   A00  WS   WS   -    -    -    A01  WS   WS   -    -    -    A02  WS
DATA   -    -    -    D00  D01  D02  D03  -    -    D10  D11  D12  D13  -
PCLK   C1   C2   C3   C4   C5   C6   C7   C1   C2   C3   C4   C5   C6   C1

16 MHz 1 WS Buffered: 4 Words/5 Clock Cycles, 51 Mbytes/Sec

ADDR   A00  WS   -    -    -    A01  WS   -    -    -    A02  WS
DATA   -    -    D00  D01  D02  D03  -    D10  D11  D12  D13  -
PCLK   C1   C2   C3   C4   C5   C6   C1   C2   C3   C4   C5   C1
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PIN DESCRIPTIONS 


Symbol   Pin               Function

A0-A16   23-39             ADDRESS INPUTS: During a burst operation, A2-A16 provides the
                           base address pointing to a block of four consecutive bytes. A0 and A1
                           select the first byte of the burst access. The 27960CX latches
                           addresses in the first clock cycle. An internal address generator
                           increments addresses A0 and A1 for subsequent bytes of the burst.

D0-D7    18, 17, 14, 13,   DATA INPUTS/OUTPUTS
         11, 10, 7, 6

ADS      42                ADDRESS STROBE: Indicates the start of a new bus access. ADS is
                           active low in the first clock cycle of a bus access.

CS       3                 CHIP SELECT: Master device enable. When asserted (active low)
                           data can be written to and read from the device. In read mode, CS
                           enables the state machine and the I/O circuitry.
                           NOTE:
                           1. The address decode path is independent of CS, i.e., X and Y
                              decoding is always powered up.
                           2. For programming, CS should remain low for the entire cycle.
                              Program and verify functions are done one byte at a time.
                           3. CS going high does not terminate a concurrent burst cycle.

BLAST    1                 BURST LAST: Terminates a concurrent burst data cycle at the rising
                           edge of the CLK. It must be asserted by the fourth data byte.

RESET    22                RESET: Resets the state machine into a known state, tri-states the
                           outputs. RESET must be asserted for a minimum of 10 clock cycles. At
                           least 5 clock cycles are required after deassertion of RESET before
                           beginning the next cycle. RESET will abort a concurrent bus cycle.

PGM      43                PROGRAM-PULSE CONTROL INPUT

Vpp      2                 PROGRAMMING POWER SUPPLY

Vss      5, 8, 12, 15,     GROUND
         19, 21

Vcc      9, 16, 20, 44     SUPPLY VOLTAGE INPUT
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INTERFACE EXAMPLE 

Overview 

This example illustrates 8-, 16- and 32-bit wide 
27960CX interfaces to the 80960CA. The designs 
offer a simple “no-glue” interface. 

A non-buffered 27960CX system organized as 256K x 32 is shown in
Figure 4A. Since the 27960CX is capable of driving an 80 pF load,
large, non-buffered systems can be implemented by stacking up to 2
banks of 4 EPROMs, resulting in a 256K x 32 memory subsystem. The
input capacitive load seen on the address lines (due to the EPROM
only) is 24 pF for a 128K x 32 system and 48 pF for a 256K x 32
system. The EPROM is specified at 6 pF for input capacitance (15 pF
max) and 12 pF typical for output capacitance. Larger systems can be
implemented with buffers (Figure 4B).

Chip Select Logic 

High order address lines are decoded to provide CS.
Qualification with other signals is not required. The
chip select logic can be implemented with standard
asynchronous decoders, PALs or PLDs (like Intel's
85C508).



Figure 4A. 256K x 32 Non-Buffered Burst EPROM Memory System 
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Figure 4B. Buffered Burst EPROM Memory System 
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Schematics 

Figure 5 shows a non-buffered, 128K x 32 27960CX 
EPROM system. 

Chip select logic, the only external logic that is re- 
quired for this interface, can be derived from the 
global system chip select circuitry. 


In a non-buffered, 16-bit system (Figure 6A) BE1
and A2 connect to the lower order address bits of
the 27960CX. BE1 connects to A0 of both EPROMs,
while A2 connects to both A1's.

In a non-buffered, 8-bit system (Figure 6B) BE0 and
BE1 connect to A0 and A1 respectively.



290236-6 

Figure 5. 128K x 32 27960CX Burst EPROM System 
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Figure 6 A. 27960CX Burst EPROM in a 16-Bit System 




Figure 6B. 27960CX Burst EPROM in an 8-Bit System


Waveforms 

Figure 7 shows the timing waveforms of a 27960CX 
pipelined read in a 32-bit system. 


CS Setup Time

CS setup time is the time between CS being asserted and the first CLK
rising edge (during the address cycle). Since a memory access begins
on the first CLK rising edge after ADS and CS are asserted, a minimum
CS setup time of 7 ns (tSVCH) at 33 MHz is required. With the
80960CA's maximum valid address delay of 14 ns at 33 MHz, 9 ns remains
for CS decoding logic.

Bootup 

The wait state configuration (1 or 2) of the 27960CX
is programmed by the user into the 80960CA Region
Table parameters of NRAD, NRDD, and NXDA.
NRDD is always 0 for the 27960CX.
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Figure 7. Two Cycles of a 27960CX 2 Wait State 4 Byte Read (2-0-0-0 Burst Read) in a 32 Bit System 


During boot-up (Figure 8), the 80960CA picks up its Region Table data
from addresses FFFF FF00, FFFF FF04, FFFF FF08 and FFFF FF0C. Only the
least significant byte of each of the above four 32-bit accesses is
used to configure the Region Table. For boot-up, the wait-state
parameters NRAD and NXDA default to 31 and 3 respectively. During
boot-up, the 27960CX will wrap around the first word of the four-word
burst and hold the first word until BLAST is asserted.


27960CX DEVICE NAMES 

The device names on the 27960CX were derived as 
mnemonics that correspond to the number of wait 
states and expected operating frequency for the de- 
vice. For example, the 25 MHz, 2 wait state 
27960CX is named 27960C2-25. 
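
As a small illustration of the naming scheme, the sketch below splits a device name into its wait-state count and frequency. The regular expression generalizes from the single example given above (27960C2-25), so treat the pattern and the function name as assumptions rather than a statement of every valid order code.

```python
import re


def parse_device_name(name):
    """Return (wait_states, frequency_mhz) for a name like '27960C2-25'."""
    match = re.fullmatch(r"27960C(\d)-(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized device name: {name!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_device_name("27960C2-25"))  # 2 wait states, 25 MHz
```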


AC TIMING DERIVATIONS 

The AC timings for the 27960CX were generated
specifically to meet the requirements of the
80960CA microprocessor. In each case the applicable
80960CA clock frequency and AC timing were
taken together with an address buffer delay (if needed)
and a typical 2 ns guardband to generate the
27960CX AC timing. Worst case timings were
always assumed. On timings where the EPROM is
faster than the microprocessor, we specified the
time required by the EPROM and left the excess
time as additional system guardband. The example
below shows how the 27960C2-33 tAVCoH timing
was derived.

@33 MHz the clock cycle is 30 ns.

tOV2 of the 80960CA is 3 ns to 14 ns.

Typical 2 ns guardband.

27960C2-33 tAVCoH = 30 ns - 14 ns - 2 ns
                  = 14 ns
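
The derivation above is a subtraction from the clock period, which can be sketched as below. `remaining_ns` is a hypothetical helper name; the numbers are the ones quoted in the worked example (nominal 30 ns period, 14 ns worst-case processor output delay, 2 ns guardband).

```python
def remaining_ns(clock_period_ns, *delays_ns):
    """Time left in one clock period after subtracting worst-case delays."""
    return clock_period_ns - sum(delays_ns)


# 27960C2-33 address setup example: period - processor delay - guardband.
t_avcoh_ns = remaining_ns(30, 14, 2)
print(t_avcoh_ns)
```

The same helper applies to the other derivations in this section by passing the decoder or buffer delay as an additional argument.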

Decoders are needed for the system's chip select
decoding. For the 27960CX timings we assumed a
10 ns chip select decoder for 16 MHz and a 7 ns
decoder for 25 MHz and 33 MHz systems. The example
below shows how the 27960C2-33 tSVCH timing
was derived.

@33 MHz the clock cycle is ~30 ns.

tOV2 of the 80960CA is 3 ns to 14 ns.

Decoder = 7 ns

27960C2-33 tSVCH = 30 ns - 14 ns - 7 ns
                 = 9 ns

System Buffering Considerations

For large system applications buffering may be re- 
quired between the microprocessor and memory de- 
vices. The 25 and 16 MHz 27960CX AC timings take 
this into account. For applications not requiring buff- 
ering these devices will provide additional system 
guardband. 

The list below shows the buffers used in generating
the 27960CX timings:

          Input Buffer    Output Buffer
25 MHz    8 ns            5 ns
16 MHz    10 ns           7 ns

Note that the 25 MHz buffers are slightly faster, in
keeping with the increased sensitivity for higher
performance. Significantly faster buffers are available
for applications requiring them. The example below
shows the tCHQV timing analysis for a buffered
27960C2-25.

@25 MHz the clock cycle is ~40 ns. 

tiHi of the 80960CA is 5 ns. 

Output buffer for 25 MHz = 5 ns 

27960C2-25 tcHQV = 40 ns - 5 ns - 5 ns 
= 30 ns 

ABSOLUTE MAXIMUM RATINGS*

Read Operating Temperature ............. 0°C to +70°C(8)
Case Temperature Under Bias ............ -10°C to +80°C(8)
Storage Temperature .................... -65°C to +125°C
All Input or Output Voltages
  with Respect to Ground ............... -0.6V to +6.5V(4)
Voltage on A9
  with Respect to Ground ............... -0.6V to +13.0V(4)
Vpp Supply Voltage
  with Respect to Ground ............... -0.6V to +14.0V(4)
Vcc Supply Voltage
  with Respect to Ground ............... -0.6V to +7.0V(4)


NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 

* WARNING: Stressing the device beyond the “Absolute
Maximum Ratings” may cause permanent damage.
These are stress ratings only. Operation beyond the
“Operating Conditions” is not recommended and extended
exposure beyond the “Operating Conditions”
may affect device reliability.


READ OPERATION 


DC CHARACTERISTICS 0°C < TA < +70°C, Vcc = 5V ±10%, TTL inputs

Symbol  Parameter                Notes    Min        Max      Unit  Test Condition
ILI     Input Load Current                           1        µA    VIN = 5.5V
ILO     Output Leakage Current                       10       µA    VOUT = 5.5V
IPP     Vpp Load Current Read                        10       µA    Vpp = 0 to Vcc, PGM = VIH
ISB     Vcc Standby, Switching   2                   45       mA    CS = VIH, f = 33 MHz
        Vcc Standby, Stable      2                   30       mA    CS = VIH
ICC     Vcc Active Current       1, 3, 7             125      mA    CS = VIL, f = 33 MHz, IOUT = 0
VIL     Input Low Voltage        4        -0.5       0.8      V
VIH     Input High Voltage                2.0        Vcc + 1  V
VOL     Output Low Voltage                           0.45     V     IOL = 2.1 mA
VOH     Output High Voltage      5        Vcc - 0.8           V     IOH = -100 µA
                                 5        2.4                 V     IOH = -400 µA
IOS     Output Short Circuit     6                   100      mA




NOTES: 

1. Maximum current is with outputs unloaded.

2. ICC standby current assumes no output loading, i.e., IOH = IOL = 0 mA.

3. ICC is the sum of current through VCC3 + VCC4 and does not include the current through VCC1 and VCC2. (VCC1 and
VCC2 supply power to the output drivers. VCC3 and VCC4 supply power to the rest of the device.)

4. Minimum DC input voltage on input and output pins is -0.5V. During transitions, this level may undershoot to -2.0V for
periods less than 20 ns.

5. Maximum DC voltage on input and output pins is Vcc + 0.5V which may overshoot to Vcc + 2.0V for periods less than
20 ns.

6. One output shorted for no more than one second. IOS is sampled but not 100% tested.

7. Ice ^ax measured with a 10. 11 juiF capacitor between Vcc and Vss- 

8. This specification defines commercial product operating temperatures. 

EXPLANATION OF AC SYMBOLS

The nomenclature used for timing parameters is as
per IEEE STD 662-1980 IEEE Standard Terminology
for Semiconductor Memory.

Each timing symbol has five characters. The first is
always a “t” (for time). The second character represents
a signal name, e.g., (CLK, ADS, etc.). The third
character represents the signal’s level (high or low)
for the signal indicated by the second character. The
fourth character represents a signal name at which a
transition occurs marking the end of the time interval
being specified.


AC CHARACTERISTICS: READ OPERATION 0°C < TA < +70°C, Vcc = 5V ±10%

                                                  27960C2-33    27960C2-25    27960C1-16
                                                  33 MHz        25 MHz        16 MHz
                                                  2 Wait State  2 Wait State  1 Wait State
No.  Symbol   Parameter                    Notes  Min   Max     Min   Max     Min   Max     Unit
1    tAVCoH   Address Valid to CLK High    CLKo   12            10            22            ns
2    tCNHAX   CLK High to Address Invalid  2      0             0             0             ns
3    tLLCH    ADS Low to CLK High          CLKo   8             8             22            ns
4    tCHLH    CLK High to ADS High         5      6     22      6     32      6     40      ns
5    tSVCH    CS Valid to CLK High         1      7             7             14            ns
6    tCNHSX   CLK High to CS Invalid       2      0             0             0             ns
7    tCHQV    CLK High to Data Valid       7            27            30            40      ns
8    tCHQX    CLK High to Data Invalid            5             5             5             ns
9    tCHQZ    CLK High to Data High Z      6            25            30            30      ns
10   tBVCH    BLAST Valid to CLK High             8             8             22            ns
11   tCHBX    CLK High to BLAST Invalid    3      5     22      5     32      5     40      ns



The fifth character represents the signal level indicated
for the fourth character. The list below shows
character representations.

A: Address                     R: Reset
B: BLAST                       Q: Data
C: Clock                       S: CS (Chip Select)
H: Logic High Level            t: Time
L: ADS/Logic Low Level         V: Valid
P: Vpp Programming Voltage     Z: Tri-state Level
X: No longer a valid “driven” logic level
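
To make the naming scheme concrete, here is a small sketch that expands a five-character timing symbol using the list above. It is an illustration only: `decode_timing_symbol` is a hypothetical helper, "S" is taken to mean CS (an assumption based on tSVCH, "CS Valid to CLK High"), and symbols carrying the wait-state subscript N (such as tCNHAX) are deliberately not handled.

```python
# Letters used in signal positions (characters 2 and 4).
SIGNALS = {
    "A": "Address", "B": "BLAST", "C": "Clock", "L": "ADS",
    "P": "Vpp", "Q": "Data", "R": "Reset", "S": "CS",
}
# Letters used in level positions (characters 3 and 5).
LEVELS = {"H": "High", "L": "Low", "V": "Valid", "X": "Invalid", "Z": "High Z"}


def decode_timing_symbol(symbol):
    """Expand e.g. 'tCHQV' into 'Clock High to Data Valid'."""
    if len(symbol) != 5 or symbol[0] != "t":
        raise ValueError(f"expected a five-character symbol starting with 't': {symbol!r}")
    _, sig1, lvl1, sig2, lvl2 = symbol
    return f"{SIGNALS[sig1]} {LEVELS[lvl1]} to {SIGNALS[sig2]} {LEVELS[lvl2]}"


for name in ("tCHQV", "tLLCH", "tSVCH"):
    print(name, "=", decode_timing_symbol(name))
```

Note that "L" is read as ADS in a signal position and as Logic Low in a level position, which is how tLLCH becomes "ADS Low to Clock High".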


NOTES: 

1. Valid signal level is meant to be either a logic high or logic low.

2. The subscript N represents the number of wait states for this parameter. CS can be de-asserted (high) after the number
of wait states (N) has expired and the EPROM will continue to burst out data for the current cycle.

3. BLAST must be returned high before the next rising clock edge.

4. The sum of tCHQV + tAVCH + N clk will not equal actual tAVQV if independent test conditions are used to obtain tAVCH
and tCHQV (N = number of wait states).

5. ADS must be returned high before the next rising clock edge.

6. Sampled, not 100% tested. The transition is measured ±500 mV from steady state voltage.

7. For capacitive loads above 80 pF, tCHQV can be derated by 1 ns/20 pF.
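
Note 7 can be read as a simple linear adjustment, sketched below. Applying the 1 ns per 20 pF figure proportionally to partial steps is an assumption (the note does not say whether derating is applied per full 20 pF step), and `derated_tchqv_ns` is a hypothetical helper.

```python
RATED_LOAD_PF = 80.0   # load at which tCHQV is specified
PF_PER_NS = 20.0       # 1 ns of derating for every 20 pF above the rated load


def derated_tchqv_ns(tchqv_ns, load_pf):
    """Estimate tCHQV for a capacitive load above the 80 pF test load."""
    extra_pf = max(0.0, load_pf - RATED_LOAD_PF)
    return tchqv_ns + extra_pf / PF_PER_NS


# 27960C2-33 (tCHQV max 27 ns) driving a 120 pF load.
print(derated_tchqv_ns(27.0, 120.0))
```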

Figure 10. 27960CX Pipelined 2 Wait State AC Waveforms

AC CONDITIONS OF TEST

Input Rise and Fall Times (10% to 90%) ..... 4 ns
Input Pulse Levels ......................... 0.45V to 2.4V
Input Timing Reference Level ............... 1.5V
Output Timing Reference Level .............. 1.5V


Table 2. Mode Table

Mode                      CS    PGM   BLAST    ADS      RESET  A9      Vpp     Vcc  OUTPUT
Read                      VIL   VIH   VIH(1)   VIH(2)   VIH    X       Vcc     Vcc  DOUT
Standby(6)                VIH   X     X        X        VIH    X       Vcc(5)  Vcc  High Z
Program                   VIL                           VIH    X       (3)     (3)  DIN
Program Verify            VIL                           VIH    X       (3)     (3)  DOUT
Program Inhibit           VIH   X     X        X        VIH    X       (3)     (3)  High Z
ID Byte 0: Manufacturer   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc  89H
ID Byte 1: Part (27960)   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc  E0H
ID Byte 2: CX             VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc  01B
ID Byte 3: 1 Wait State   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc  01B
           2 Wait States                                                            10B
Reset                     X     X     X        X        VIL    X       Vcc     Vcc  High Z



NOTES: 

1. VIH until data terminated at which time BLAST must go to VIL.

2. Need to toggle from VIH to VIL to VIH.

3. See DC Programming Characteristics for Vcc, VID and Vpp voltages.

4. X can be VIL or VIH.

5. Vpp = Vcc to meet standby current specification. Vcc > Vpp > VIL will cause a slight increase in standby current.

6. The device must be in the idle state (by asserting RESET or using BLAST) before going into standby. 


CAPACITANCE(1) TA = 25°C, f = 1.0 MHz

Symbol  Parameter           Typ  Max  Unit  Condition
CIN     Input Capacitance   4    6    pF    VIN = 0V
COUT    Output Capacitance  12   15   pF    VOUT = 0V
CVPP    Vpp Capacitance     40   45   pF    VIN = 0V



NOTE: 

1. Sampled. Not 100% tested. 

AC INPUT/OUTPUT REFERENCE WAVEFORMS

Input and output timings are measured from 1.5V.
Timing values are specified assuming maximum input
and output rise and fall time = 4 ns.


AC TESTING LOAD CIRCUIT

[Figure 290236-15: device under test output tied through 780 Ω to 2.1V, with CL = 80 pF]

CL includes jig capacitance.

For tCHQZ, CL = 5 pF and RL = 405 Ω.


CLOCK CHARACTERISTICS

                    33 MHz          25 MHz          20 MHz          16 MHz
Symbol  Parameter   Min        Max  Min        Max  Min        Max  Min        Max  Units
CLK     Period      30.3            40              50              62.5            ns
tPR     Rise Time   1          4    1          4    1          4    1          4    ns
tPF     Fall Time   1          4    1          4    1          4    1          4    ns
tPL     Low Time    (t/2) - 2  t/2  (t/2) - 3  t/2  (t/2) - 4  t/2  (t/2) - 4       ns
tPH     High Time   (t/2) - 2  t/2  (t/2) - 3  t/2  (t/2) - 4  t/2  (t/2) - 4       ns

Max Rise Time for Programming CLK = 100 ns
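
As a hedged illustration of the (t/2) - k entries for minimum clock low and high time, the sketch below evaluates them from the minimum clock periods listed in the table. The dictionaries simply restate the table values; `min_half_cycle_ns` is a hypothetical helper.

```python
# Minimum clock period (ns) and half-cycle margin k (ns), per speed grade in MHz.
MIN_PERIOD_NS = {33: 30.3, 25: 40.0, 20: 50.0, 16: 62.5}
HALF_CYCLE_MARGIN_NS = {33: 2.0, 25: 3.0, 20: 4.0, 16: 4.0}


def min_half_cycle_ns(mhz):
    """Minimum CLK low (or high) time: (t/2) - k at the minimum period t."""
    return MIN_PERIOD_NS[mhz] / 2.0 - HALF_CYCLE_MARGIN_NS[mhz]


for grade in sorted(MIN_PERIOD_NS, reverse=True):
    print(f"{grade} MHz: {min_half_cycle_ns(grade):.2f} ns")
```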


CLOCK WAVEFORM 

Program/Program Verify

Initially, and after each erasure, all bits of the
EPROM are in the “1’s” state. Data is introduced by
selectively programming “0’s” into the desired bit
locations. Although only “0’s” can be programmed,
both “1’s” and “0’s” can be present in the data
word. Ultraviolet erasure is the only way to change
“0’s” to “1’s”.

Programming mode is entered when Vpp is raised to
12.75V. Program/Verify operation is synchronous
with the clock and can only be initiated following an
Idle state. Program and Program Verify take place in
3 clock cycles. In the first clock cycle, addresses
and data are input and programming occurs. Program
Verify follows in the second clock cycle and
the third clock cycle terminates synchronous
Program/Verify operation, returning the state machine
to the Idle state with outputs at high impedance.

As in the Read mode, A2-A16 point to a four byte
block in the memory array. During programming, the
internal address increment circuitry is disabled and
the programmer must supply A0 and A1 to point to
an individual byte within the four byte block that is to
be programmed. Only one byte is programmed in
each 3 cycle Program/Verify sequence.


Program Inhibit 

The Program Inhibit mode allows parallel programming
and verification of multiple devices with different
data. With Vpp at 12.75V, a Program/Verify sequence
is initiated for any device that receives a valid
ADS pulse and rising clock edge while CS is asserted.
A PGM pulse programs data in the first cycle
of the sequence and data for Program Verify is output
in the second cycle. The Program/Verify sequence
is inhibited on any devices for which CS is
not asserted. Data will not be programmed and the
outputs will remain in their high impedance state.


inteligent Identifier™ Mode

The device’s manufacturer, product type, and configuration
are stored in a four byte block that can be
accessed by using the inteligent Identifier™ mode.
The programmer can verify the device identifier and
choose the programming algorithm that corresponds
to the Intel 27960CX. The inteligent Identifier can
also be used to verify that the product is configured
with the desired Read mode options for wait states.

inteligent Identifier mode is entered when A9 (pin 32)
is raised to its high voltage (VID) level. The internal
state machine is then set for inteligent Identifier
Read operation. Reading the identifier is similar to a
Read operation on a one wait state configured product.
Up to four bytes can be read in a single burst
access. inteligent Identifier read is terminated by a
synchronous BLAST input, returning the state machine
to the idle state with outputs at high impedance.

The four byte block for the inteligent Identifier
code is located at address 00H through 03H and is
encoded as follows:

MEANING         (A1,A0)    DATA
Intel ID        Byte 00    89h
27960           Byte 01    E0h
CX              Byte 10    01b
1 Wait State    Byte 11    01b
2 Wait States   Byte 11    10b
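
A programmer could check the identifier block along the lines of the sketch below. The byte values are those listed in the table above; the function name, return shape and error handling are illustrative assumptions, not part of any Intel programming specification.

```python
INTEL_ID = 0x89
PART_27960 = 0xE0
VARIANT_CX = 0b01
WAIT_STATES = {0b01: 1, 0b10: 2}  # byte 3 encoding


def decode_identifier(block):
    """Interpret the four identifier bytes read at addresses 00H through 03H."""
    if len(block) != 4:
        raise ValueError("identifier block must be exactly four bytes")
    manufacturer, part, variant, config = block
    if manufacturer != INTEL_ID or part != PART_27960 or variant != VARIANT_CX:
        raise ValueError("not a 27960CX identifier block")
    if config not in WAIT_STATES:
        raise ValueError(f"unknown wait-state code: {config:#04b}")
    return {"device": "27960CX", "wait_states": WAIT_STATES[config]}


print(decode_identifier([0x89, 0xE0, 0b01, 0b10]))
```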


RESET MODE 

Due to the synchronous nature of the 27960CX, the
various operating modes must be initiated from a
known Idle state. During normal operation, the internal
state machine returns to an idle state at the
termination of a bus access (after BLAST is asserted).

During initial device power up, the state machine is
in an indeterminate state. The reset mode is provided
to force operation into the Idle state. Reset mode
is entered when the RESET pin is asserted. Output
pins are asynchronously set to the high impedance
state and address latches are put into the flow
through mode. A reset is successfully completed
and the state machine set in an idle state when
RESET has been asserted for a minimum of 10
clock cycles and deasserted for five clock cycles.
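
For a given clock period, the reset requirement above translates into minimum durations as sketched here. `reset_sequence_ns` is a hypothetical helper and the 40 ns period is just the 25 MHz example value.

```python
RESET_ASSERTED_CYCLES = 10    # RESET held asserted
RESET_DEASSERTED_CYCLES = 5   # RESET deasserted before the next access


def reset_sequence_ns(clock_period_ns):
    """Return (asserted_ns, deasserted_ns) minimum durations for a reset."""
    return (
        RESET_ASSERTED_CYCLES * clock_period_ns,
        RESET_DEASSERTED_CYCLES * clock_period_ns,
    )


asserted_ns, deasserted_ns = reset_sequence_ns(40.0)  # 25 MHz clock
print(asserted_ns, deasserted_ns)
```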

Figure 11. Quick-Pulse Programming™ Algorithm

QUICK-PULSE PROGRAMMING™
ALGORITHM

The Quick-Pulse Programming algorithm programs
Intel’s 27960CX. Developed to substantially reduce
programming throughput time, this algorithm allows
optimized equipment to program a 27960CX in under
17 seconds. Actual programming time depends
on the programmer used.

The Quick-Pulse Programming algorithm uses a
100 µs pulse followed by a byte verification to determine
when the addressed byte is correctly programmed.
The algorithm terminates if 25 100 µs
pulses fail to program a byte. Figure 11 shows the
27960CX Quick-Pulse Programming algorithm flowchart.

The entire program-pulse/byte-verify sequence is
performed with Vcc = 6.25V and Vpp = 12.75V.
The program equipment must establish Vcc before
applying voltages to any other pins. When programming
is complete, all bytes should be compared to
the original data with Vcc = 5.0V and Vpp =
12.75V.
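
The pulse-and-verify loop described above can be sketched as follows. This is a simplified model built only from the prose (the Figure 11 flowchart is not reproduced here): `FakeEprom`, `apply_pulse` and `verify` are hypothetical stand-ins for real programmer hardware, and supply sequencing and the final compare pass are omitted.

```python
MAX_PULSES = 25  # the algorithm gives up after 25 pulses of 100 µs on one byte


def quick_pulse_program(data, apply_pulse, verify):
    """Program each byte with up to MAX_PULSES pulses, verifying after each one.

    Returns the number of pulses used per byte; raises if a byte never verifies.
    """
    pulses_used = []
    for address, byte in enumerate(data):
        for attempt in range(1, MAX_PULSES + 1):
            apply_pulse(address, byte)   # one 100 µs program pulse
            if verify(address, byte):    # byte verification after the pulse
                pulses_used.append(attempt)
                break
        else:
            raise RuntimeError(f"byte at address {address} failed after {MAX_PULSES} pulses")
    return pulses_used


class FakeEprom:
    """Toy device: a byte 'takes' once it has received pulses_needed pulses."""

    def __init__(self, size, pulses_needed):
        self.cells = [0xFF] * size          # erased state is all 1's
        self.pulses_needed = pulses_needed
        self.pulse_counts = [0] * size

    def apply_pulse(self, address, byte):
        self.pulse_counts[address] += 1
        if self.pulse_counts[address] >= self.pulses_needed:
            self.cells[address] &= byte     # programming can only clear bits

    def verify(self, address, byte):
        return self.cells[address] == byte


device = FakeEprom(size=4, pulses_needed=3)
image = [0x12, 0x00, 0xFF, 0xA5]
print(quick_pulse_program(image, device.apply_pulse, device.verify))
```

In this toy model an all-1's byte verifies after a single pulse because the erased cell already holds that value.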


D.C. PROGRAMMING CHARACTERISTICS TA = 25°C ±5°C

Symbol  Parameter                         Notes  Min        Max        Unit  Condition
ILI     Input Load Current                                  10         µA    VIN = VIH or VIL
ICC     Vcc Program Current               1                 125        mA    CS = VIL
IPP     Vpp Program Current               1                 50         mA    CS = VIL
VIL     Input Low Voltage                        -0.5       0.8        V
VIH     Input High Voltage                       2.0        Vcc + 0.5  V
VOL     Output Low Voltage (Verify)                         0.40       V     IOL = 2.1 mA
VOH     Output High Voltage (Verify)             Vcc - 0.8             V     IOH = -400 µA
VID     A9 inteligent Identifier Voltage         11.5       12.5       V
Vcc     Supply Voltage (Program)          2      6.0        6.5        V
Vpp     Program Voltage                   2      12.5       13.0       V

NOTES:

1. The maximum current value is with outputs unloaded.

2. Vcc must be applied simultaneously or before Vpp and removed simultaneously or after Vpp.

3. During programming clock levels are VIH and VIL.

A.C. PROGRAMMING, RESET AND ID CHARACTERISTICS TA = 25°C ±5°C

No.  Symbol   Parameter                        Notes  Min  Max  Unit
1    tAVPL    Address Valid to PGM Low                2         µs
2    tCHAX    CLK High to Address Invalid             50        ns
3    tLLCH    ADS Low to CLK High              1      50        ns
4    tCHLH    CLK High to ADS High             2      50        ns
5    tSVCH    CS Valid to CLK High                    50        ns
6    tCHSX    CLK High to CS Invalid           3                ns
7    tCHQV    CLK High to DOUT Valid                  100       ns
8    tCHQX    CLK High to DOUT Invalid                0         ns
9    tBVCH    BLAST Valid to CLK High                 50        ns
10   tCHBX    CLK High to BLAST Invalid        4      50        ns
11   tQVPL    DATA Valid to PGM Low                   2         µs
12   tPLPH    PGM Program Pulse Width                 95   105  µs
13   tPHQX    PGM High to DIN Invalid                 2         µs
14   tCLPL    CLK Low to PGM Low                      50        ns
15   tQZCH    DIN Tri-State to CLK High               2         µs
16   tVCS     Vcc Program Voltage to CLK High  7      2         µs
17   tVPS     Vpp Program Voltage to CLK High  7      2         µs
18   tA9HCH   A9 VID Voltage to CLK High              2         µs
19   tCHA9X   CLK High to A9 Not VID Voltage          2         µs
20   tRVCH    RESET Valid to CLK High          6      50        ns
21   tCHCL    CLK High to CLK Low              5      100       ns
22   tCLCH    CLK Low to CLK High              5      100       ns

NOTES:

1. If CS is low, ADS can go low no sooner than the falling edge of the previous CLK.

2. ADS must return high prior to the next rising edge of clock.

3. CS must remain low until after the rising edge of CLK1.

4. BLAST must return high prior to the next rising edge of CLK.

5. Max CLK rise/fall time is 100 ns.

6. RESET must be low for 10 clock cycles and high for 5 clock cycles.

7. Vcc must be applied simultaneously or before Vpp and removed simultaneously or after Vpp.

27960KX

BURST ACCESS 1M (128K x 8) CHMOS EPROM

■ Synchronous 4-Byte Data Burst Access
■ Simple Interface to the 80960KA/KB
■ High Performance Clock to Data Out
  - Zero Wait State Data-to-Data Burst
  - Supports 16, 20 and 25 MHz 80960KA/KB Devices
■ Asynch Microcontroller Reset Function
  - Returns to Known State with High Z Outputs
■ CHMOS* III-E for High Performance and Low Power
  - 125 mA Active, 30 mA Standby
  - TTL Compatible Inputs
■ 1 Mbit Density Configured as 128K x 8

Intel’s 27960KX is a 5V only, 1,048,576 bit Erasable Programmable Read Only Memory, organized as 128K
words of 8 bits.

The 27960KX provides a simple synchronous burst interface to the 80960KA/KB bus. Internally the 27960KX
is organized in 4 byte blocks, in which each byte is accessed sequentially. The internal state machine is factory
configured to generate either 1 or 2 wait-states between the address and first data byte. High performance
outputs provide zero wait-state data to data accesses at clock frequencies up to 25 MHz.

An asynchronous microcontroller RESET feature puts the outputs in the high impedance state and takes the
internal state machine to a known state where a new burst access can begin.

The 27960KX is available in a 44 lead PLCC package, providing optimum cost effectiveness.

The 27960KX is manufactured on Intel’s 1 micron CHMOS III-E technology. The Quick-Pulse Programming™
algorithm provides fast, reliable programming with throughput under 17 seconds for optimized equipment.

*CHMOS is a patented process of Intel Corporation. 



Figure 1. 27960KX Burst EPROM Block Diagram 

27960KX BURST EPROM Architecture

EPROMs are established as the preferred code storage
device in embedded applications. The non-volatile,
flexible, reliable, cost effective EPROM makes a
product easier to design, manufacture and service.
Until recently, however, EPROMs could not match
the performance needs of high-end systems. The
27960KX was designed to support the 80960KA/KB
embedded processor. It utilizes the burst interface to
offer near zero-wait state performance without the
high cost normally associated with this performance.

In embedded designs, board space and cost must
be kept at a minimum without impacting performance
and reliability. The 27960KX removes the need
for expensive high-speed shadow RAM backed up
by slow EPROM or ROM for non-volatile code storage.
Code optimization concerns are reduced with
“off-chip” code fetches no longer crippling to system
performance. FONTs can be run directly out of
these EPROMs at the same performance as high-speed
DRAMs. With the 27960KX, the EPROM is
the ideal code or FONT storage device for your
80960KA/KB system.

The 27960KX provides a simple, synchronous burst
interface to the 80960KA/KB’s bus. Internally, the
27960KX is organized in 4 byte blocks; each byte is
accessed sequentially. A burst access begins on the
first clock pulse after CS is asserted. The address of
the four byte block is latched by the rising edge of
ALE. After a preset number of wait-states (1 or 2),
data is output one byte at a time on each subsequent
clock cycle. A burst access is terminated on
the rising edge of CLOCK if BLAST is asserted. High
performance outputs provide zero wait-state data to
data accesses at clock frequencies up to 25 MHz.
Extra power and ground pins dedicated to the outputs
reduce the effects of fast output switching on
device performance.

The 27960KX delivers 4 bytes of data in 8 clock
cycles at 25 MHz and 4 bytes of data in 7 clock
cycles at 20 MHz. In a 32-bit configuration, this
translates into a read bandwidth of 50 Mbytes/sec
and 45 Mbytes/sec respectively. Performance capability
of the 27960KX in different 80960KA/KB systems
is given in Table 1.
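
The bandwidth figures quoted above can be checked with a short calculation. The sketch assumes that a 32-bit configuration means four 27960KX devices side by side, each delivering 4 bytes per burst (16 bytes per burst in total); `burst_bandwidth_mbytes` is a hypothetical helper.

```python
def burst_bandwidth_mbytes(clock_mhz, cycles_per_burst, devices=4, bytes_per_device=4):
    """Read bandwidth in Mbytes/sec for back-to-back bursts."""
    bytes_per_burst = devices * bytes_per_device
    return bytes_per_burst * clock_mhz / cycles_per_burst


print(burst_bandwidth_mbytes(25, 8))            # 25 MHz, 8 clocks per burst
print(round(burst_bandwidth_mbytes(20, 7), 1))  # 20 MHz, 7 clocks per burst
```

The 20 MHz, 7-clock case works out to a little under 46 Mbytes/sec, which the text quotes as 45.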



Figure 2. 27960KX Burst EPROM Signal Set 

Figure 3. 27960KX 44-Lead PLCC Pinout


PIN DESCRIPTIONS

Symbol   Pin               Function

A0-A16   23-39             ADDRESS INPUTS: During a burst operation, A2 through A16 provide the base
                           address pointing to a block of four consecutive bytes. A0 and A1 select the first
                           byte of the burst access. The 27960KX latches valid addresses in the first clock
                           cycle. An internal address generator increments addresses A0 and A1 for
                           subsequent bytes of the burst.

D0-D7    18, 17, 14, 13,   DATA INPUTS/OUTPUTS
         11, 10, 7, 6

ALE      42                ADDRESS LATCH ENABLE: Indicates the transfer of a physical address. ALE
                           is an active low signal used to latch the addresses from the processor.
                           Addresses are latched on the rising edge of ALE. Valid addresses must be
                           present at or before ALE becomes valid.

CS       3                 CHIP SELECT: Master device enable. When asserted (active low) data can be
                           written to and read from the device. In read mode, CS enables the state
                           machine and the I/O circuitry.
                           NOTES:
                           1. The address decode path is independent of CS, i.e., X and Y decoding is
                              always powered up.
                           2. For programming, CS should remain low for the entire cycle. Program and
                              verify functions are done one byte at a time.
                           3. CS going high does not terminate a concurrent burst cycle.
                           4. CS must be deasserted between bursts.

BLAST    1                 BURST LAST: Terminates a concurrent burst data cycle at the rising edge of the
                           CLK. It must be asserted by the fourth data byte.

RESET    22                RESET: Resets the state machine into a known state, tri-states the outputs. The
                           duration of RESET should be 10 CLK cycles minimum. At least 5 clock cycles
                           are required after deassertion of RESET before beginning the next cycle. Reset
                           will abort a concurrent bus cycle.

PIN DESCRIPTIONS (Continued)

Symbol   Pin               Function

PGM      43                PROGRAM-PULSE CONTROL INPUT

Vpp      2                 PROGRAMMING POWER SUPPLY Vpp

Vss      5, 8, 12,         GROUND
         15, 19, 21

Vcc      9, 16, 20, 44     SUPPLY VOLTAGE INPUT


Table 1. Performance Capability

25/20 MHz 2 WS NON-BUFFERED: 4 WORDS/8 CLOCK CYCLES, 50/40 MBYTES/SEC

ADDR  A00  WS   WS   -    -    -    -    RS   A01  WS   WS   -    -    -    -    RS
DATA  -    -    -    D00  D01  D02  D03  -    -    -    -    D10  D11  D12  D13  -
CLK   C1   C2   C3   C4   C5   C6   C7   C8   C1   C2   C3   C4   C5   C6   C7   C8

20 MHz 1 WS NON-BUFFERED: 4 WORDS/7 CLOCK CYCLES, 45 MBYTES/SEC

ADDR  A00  WS   -    -    -    -    RS   A01  WS   -    -    -    -    RS   A03  WS
DATA  -    -    D00  D01  D02  D03  -    -    -    D10  D11  D12  D13  -
CLK   C1   C2   C3   C4   C5   C6   C7   C1   C2   C3   C4   C5   C6   C7

16 MHz 1 WS BUFFERED: 4 WORDS/7 CLOCK CYCLES, 36 MBYTES/SEC

ADDR  A00  WS   -    -    -    -    RS   A01  WS   -    -    -    -    RS   A03  WS
DATA  -    -    D00  D01  D02  D03  -    -    -    D10  D11  D12  D13  -
CLK   C1   C2   C3   C4   C5   C6   C7   C1   C2   C3   C4   C5   C6   C7



INTERFACE EXAMPLE 

Overview 

The following design offers a simple interface to the
80960KA/KB’s bus.

A non-buffered 27960KX burst EPROM system is
shown in Figure 4. Since the 27960KX is capable of
driving a 120 pF load, large, non-buffered systems
can be implemented by stacking up to 2 banks of 4
EPROMs, giving a memory size of 256K x 32. The
input capacitive load seen on the address lines (due
to the EPROM only) is 24 pF for a 128K x 32
system (shown) and 48 pF for a 256K x 32 system.
The EPROM is specified at 4 pF for input capacitance
and 12 pF typical for output capacitance.
Larger systems can be implemented with buffers.

Chip Select Logic 

High order address lines are decoded to provide CS.
Qualification with other signals is not required. The
chip select logic can be implemented with standard
asynchronous decoders, PALs or PLDs (like Intel’s
85C960).

NOTE:

27960KX does not require address latches.


Figure 4. 128K x 32 Burst EPROM System 


Waveforms 

Figure 5 shows the timing waveforms of 27960KX 
reads in a 32-bit system. 


CS Deassert between bursts

After every EPROM read (one to four words) CS
must be deasserted.


CS setup time

CS setup time is the time between CS asserted and
the first rising edge of CLK (during the address
cycle). Since a memory access begins on the first
CLK rising edge after CS is asserted, a minimum CS
setup time of 5 ns (tSVCH) at 25 MHz is required.
With the 80960KA/KB’s maximum valid address delay
of 18 ns at 25 MHz, 13 ns remains for CS decoding
logic.
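
The 13 ns figure can be reproduced with the arithmetic below. Note that 40 ns - 18 ns - 5 ns alone leaves 17 ns; the quoted 13 ns matches only if the 4 ns positive clock skew guardband described under AC TIMING DERIVATIONS is also subtracted. That reading is an assumption on our part, and `cs_decode_budget_ns` is a hypothetical helper.

```python
def cs_decode_budget_ns(period_ns, address_delay_ns, cs_setup_ns, skew_guardband_ns=0.0):
    """Time available for chip select decoding within one clock period."""
    return period_ns - address_delay_ns - cs_setup_ns - skew_guardband_ns


# 25 MHz: 40 ns period, 18 ns address delay, 5 ns CS setup.
without_skew = cs_decode_budget_ns(40, 18, 5)
with_skew = cs_decode_budget_ns(40, 18, 5, skew_guardband_ns=4)  # assumed 4 ns guardband
print(without_skew, with_skew)
```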


Reset and RESET 

The 27960KX uses RESET. The 80960KA/KB
RESET signal must be inverted for the 27960KX.

Clock Phase 

The initial rising edge of CLK and CLK2 must be in 
phase with as small a skew as possible. 


[Figure 5 waveform: bus states A, WS, D, D, D, D, RC for each of two cycles, shown against CLK]

NOTES:

1. 1-0-0-0 Burst Read: the 1 indicates the number of wait states to access the first word;
the 0's indicate the number of wait states for subsequent data words (0 in this case).

2. The 27960KX latches addresses on the rising edge of ALE; it has an internal address generator which increments
addresses for subsequent words of the burst.


Figure 5. Two Cycles of a 27960KX 1 Wait State, 4-Byte Read (1-0-0-0 Burst Read) in a 32-Bit System


27960KX DEVICE NAMES 

The device names on the 27960KX were derived as 
mnemonics that correspond to the number of wait 
states and expected operating frequency for the de- 
vice. For example, the 25 MHz, 2 wait state 
27960KX is named 27960K2-25. 


AC TIMING DERIVATIONS 

The AC timings for the 27960KX were generated
specifically to meet the requirements of the
80960KA/KB microprocessor. In each case the applicable
80960KA/KB clock frequency and AC timing
were taken together with an address buffer delay
(if needed) and a 4 ns positive clock skew or a 2 ns
negative clock skew (see Figure 6A) guardband to
generate the 27960KX AC timing. Worst case timings
were always assumed. The example below
shows how the 27960K1-20 tavcgh timing was derived.

@20 MHz the clock cycle is ~50 ns.
t6 of the 80960KA/KB is 2 ns to 20 ns.

4 ns clock skew guardband.

27960K1-20 tavcgh = 50 ns - 20 ns - 4 ns
                  = 26 ns


On timings such as this, where the EPROM is faster
than the microprocessor, we specified the EPROM’s
timing, leaving the excess time as system guardband.

NOTE:

The 27960KX allows a positive clock skew (CLK2 leading CLK) of up to 4 ns and a negative clock skew (CLK2 lagging
CLK) of up to 2 ns. The larger positive clock skew takes into account longer trace lengths and heavier loading on the 1x
clock trace.


Figure 6A. Definition of Positive and Negative Clock Skew 



290237-12 

NOTE: 

CLK and CLK2 are generated by the same PAL. This minimizes skew between CLK and CLK2. Both PAL outputs are fed 
to a 74F244 driver. The EPROMs should be as close to the clock driver as possible. 


Figure 6B. Example Clock Circuit with Minimum Skew 

NOTE:

This clock generation circuit uses a 100 MHz oscillator. The EPROMs should be as close to the NAND drivers as
possible.


Figure 6C. Example Clock Circuit Using a 100 MHz Oscillator


Decoders are needed for the system's address (chip
select) decoding. For the 27960KX’s timings we assumed
a 5-10 ns chip select decoder for 16 MHz
and 20 MHz frequencies and a 5-9 ns decoder for
25 MHz systems. The example below shows how
the 27960K2-25 tSVCH timing was derived.

@25 MHz the clock cycle is ~40 ns.
t6 of the 80960KA/KB is 2 ns to 18 ns.

Decoder = 9 ns
4 ns clock skew guardband

27960K2-25 tSVCH = 40 ns - 18 ns - 9 ns - 4 ns
                 = 9 ns


SYSTEM BUFFERING CONSIDERATIONS 

For many large system applications buffering may 
be required between the microprocessor and memo- 
ry devices. The 20 MHz - 2 WS and 16 MHz 
27960KX AC timings take this into account. For ap- 
plications at these frequencies not requiring buffer- 
ing these devices will provide an additional 5-10 ns 
of system guardband. 


The list below shows the buffers used in generating
these timings:

          Input Buffer    Output Buffer
20 MHz    9 ns            5 ns
16 MHz    10 ns           7 ns

The 20 MHz buffers are slightly faster, in keeping
with the increased sensitivity for higher performance.
We chose the above buffers because of their
wide availability. Significantly faster buffers are available
for applications requiring them. The example
below shows tCHQV for the 27960K2-20.

@20 MHz the clock cycle is ~50 ns.
t10 of the 80960KA/KB is 3 ns.

Output buffer for 20 MHz = 5 ns.

4 ns clock skew guardband

27960K2-20 tCHQV = 50 ns - 5 ns - 3 ns - 4 ns
                 = 38 ns

ABSOLUTE MAXIMUM RATINGS*

Read Operating Temperature ............. 0°C to +70°C(8)
Case Temperature Under Bias ............ -10°C to +80°C(8)
Storage Temperature .................... -65°C to +125°C
All Input or Output Voltages
  with Respect to Ground ............... -0.6V to +6.5V(4)
Voltage on A9
  with Respect to Ground ............... -0.6V to +13.0V(4)
Vpp Supply Voltage
  with Respect to Ground ............... -0.6V to +14.0V(4)
Vcc Supply Voltage
  with Respect to Ground ............... -0.6V to +7.0V(4)


NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 

* WARNING: Stressing the device beyond the “Absolute 
Maximum Ratings” may cause permanent damage. 
These are stress ratings only. Operation beyond the 
“Operating Conditions” is not recommended and ex- 
tended exposure beyond the “Operating Conditions” 
may affect device reliability. 


DC CHARACTERISTICS: READ OPERATION

0°C < TA < +70°C, Vcc = 5V ±10%, TTL Inputs

Symbol  Parameter                Notes    Min        Max      Unit  Test Condition
ILI     Input Load Current                           1        µA    VIN = 5.5V
ILO     Output Leakage Current                       10       µA    VOUT = 5.5V
IPP     Vpp Load Current Read                        10       µA    Vpp = 0 to Vcc, PGM = VIH
ISB     Vcc Standby, Switching   2                   45       mA    CS = VIH, f = 25 MHz
        Vcc Standby, Stable      2                   30       mA    CS = VIH
ICC     Vcc Active Current       1, 3, 7             125      mA    CS = VIL, f = 25 MHz, IOUT = 0 mA
VIL     Input Low Voltage        4        -0.5       0.8      V
VIH     Input High Voltage                2.0        Vcc + 1  V
VOL     Output Low Voltage                           0.45     V     IOL = 2.1 mA
VOH     Output High Voltage      5        Vcc - 0.8           V     IOH = -100 µA
                                 5        2.4                 V     IOH = -400 µA
IOS     Output Short Circuit     6                   100      mA


NOTES: 

1. Maximum current is with outputs unloaded. 

2. ICC standby current assumes no output loading, i.e., IOH = IOL = 0 mA. 

3. ICC is the sum of current through Vcc3 + Vcc4 and does not include the current through Vcc1 and Vcc2. (Vcc1 and 
Vcc2 supply power to the output drivers. Vcc3 and Vcc4 supply power to the rest of the device.) 

4. Minimum DC voltage on input and output pins is -0.5V. During transitions, this level may undershoot to -2.0V for 
periods less than 20 ns. 

5. Maximum DC voltage on input and output pins is Vcc + 0.5V, which may overshoot to Vcc + 2.0V for periods less than 
20 ns. 

6. One output shorted for no more than one second. IOS is sampled but not 100% tested. 

7. ICC max measured with a 10.1 1 µF capacitor between Vcc and Vss. 

8. This specification defines commercial product operating temperatures. 
EXPLANATION OF AC SYMBOLS 

The nomenclature used for timing parameters is as 
per IEEE STD 662-1980, IEEE Standard Terminology 
for Semiconductor Memory. 

Each timing symbol has five characters. The first is 
always a “t” (for time). The second character represents 
a signal name (e.g., CLK, ALE, etc.). The third 
character represents the signal’s level (high or low) 
for the signal indicated by the second character. The 
fourth character represents a signal name at which a 
transition occurs marking the end of the time interval 
being specified. 


AC CHARACTERISTICS: READ OPERATION 0°C < TA < +70°C, Vcc = 5V ± 10% 

Versions                                             27960K2-25      27960K1-20      27960K2-20      27960K1-16 
                                                     25 MHz          20 MHz          20 MHz          16 MHz 
                                                     2 Wait States   1 Wait State    2 Wait States   1 Wait State 
No  Symbol  Characteristic                    Notes  Min   Max       Min   Max       Min   Max       Min   Max       Unit 
1   tAVC0H  Address Valid to CLK High (CLK0)         12              18              10              15              ns 
2   tAVLH   Address Valid to ALE High                10              10              10              10              ns 
3   tLLLH   ALE Low to ALE High                      12              12              12              12              ns 
4   tLHAX   ALE High to Address Invalid              8               8               8               8               ns 
5   tSVCH   CS Valid to CLK High              1, 5   5               8               7               8               ns 
6   tCNHSX  CLK High to CS Invalid            2      0               0               0               0               ns 
7   tCHQV   CLK High to Data Valid            7            33              43              38              45        ns 
8   tCHQX   CLK High to Data Invalid                 7               7               7               7               ns 
9   tCHQZ   CLK High to Data High-Z           6            30              35              35              35        ns 
10  tBVCH   BLAST Valid to CLK High                  15              15              15              15              ns 
11  tCHBX   CLK High to BLAST Invalid         3      5     35        5     45        5     45        5     45        ns 


The fifth character represents the signal level indicated 
for the fourth character. The list below shows 
character representations. 

A: Address 
B: BLAST 
C: Clock 
H: Logic High Level 
L: ALE / Logic Low Level 
P: Vpp Programming Voltage 
Q: Data 
R: Reset 
S: CS 
t: Time 
V: Valid 
X: No longer a valid “driven” logic level 
Z: Tri-state level 
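
As an illustration of the five-character convention, the sketch below expands a timing symbol into words. It is a minimal Python example, not part of the data sheet: the function name and dictionaries are assumptions, and the mapping of S to CS is inferred from the tSVCH and tCHSX rows of the AC tables.

```python
# Second and fourth characters name a signal.
SIGNALS = {"A": "Address", "B": "BLAST", "C": "CLK", "L": "ALE",
           "P": "Vpp", "Q": "Data", "R": "RESET", "S": "CS"}
# Third and fifth characters name a level or state.
LEVELS = {"H": "High", "L": "Low", "V": "Valid", "X": "Invalid", "Z": "High-Z"}


def describe(symbol):
    """Expand a five-character timing symbol such as 'tCHQV'."""
    if len(symbol) != 5 or symbol[0] != "t":
        raise ValueError("expected a five-character symbol starting with 't'")
    _, start_signal, start_level, end_signal, end_level = symbol
    return (f"{SIGNALS[start_signal]} {LEVELS[start_level]} to "
            f"{SIGNALS[end_signal]} {LEVELS[end_level]}")


print(describe("tCHQV"))  # CLK High to Data Valid
print(describe("tLLLH"))  # ALE Low to ALE High
```

Note that "L" is read as ALE in a signal position and as a low level in a level position, which is why tLLLH decodes unambiguously.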


NOTES: 

1. Valid signal level is meant to be either a logic high or logic low. 

2. tCNHSX: The subscript N represents the number of wait states for this parameter. CS can be de-asserted (high) after the 
number of wait states (N) has expired. The EPROM will continue to burst out data for the current cycle. 

3. BLAST must be returned high before the next rising clock edge. 

4. The sum of tCHQV + tAVCH + N·CLK will not equal actual tAVQV if independent test conditions are used to obtain tAVCH 
and tCHQV (N = number of wait states). 

5. CS must be deasserted after every burst read (see Figure 7). 

6. Sampled, not 100% tested. The transition is measured ±500 mV from steady state voltage. 

7. For capacitive loads above 120 pF, tCHQV can be derated by 1 ns/20 pF. 
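
Note 7 can be applied as a simple adjustment. The sketch below is a hedged Python illustration; it assumes the derating is linear in the load above 120 pF (the note does not say whether it is applied per started 20 pF step), and the function name is hypothetical.

```python
def derated_tchqv_ns(tchqv_ns, load_pf, rated_load_pf=120.0, pf_per_ns=20.0):
    """Add 1 ns to tCHQV for every 20 pF of load above the rated 120 pF."""
    extra_pf = max(0.0, load_pf - rated_load_pf)  # no credit below the rated load
    return tchqv_ns + extra_pf / pf_per_ns


# 27960K2-20 (tCHQV = 38 ns) driving a 160 pF load:
print(derated_tchqv_ns(38.0, 160.0))  # 40.0
```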
AC CONDITIONS OF TEST 

Input Rise and Fall Times (10% to 90%) ... 4 ns 
Input Pulse Levels ....................... 0.45V to 2.4V 
Input Timing Reference Level ............. 1.5V 
Output Timing Reference Level ............ 0.8V and 2.0V 


Table 2. Mode Table 

MODE                      CS    PGM   BLAST    ALE      RESET  A9      Vpp     Vcc   OUTPUT 
Read                      VIL   VIH   VIH(1)   VIH(2)   VIH    X(4)    Vcc     Vcc   Dout 
Standby(6)                VIH   X     X        X        VIH    X       Vcc(5)  Vcc   High Z 
Program                   VIL   VIL   VIH      VIH(2)   VIH    X       (3)     (3)   Din 
Program Verify            VIL   VIH   VIH(1)   VIH      VIH    X       (3)     (3)   Dout 
Program Inhibit           VIH   X     X        X        VIH    X       (3)     (3)   High Z 
ID Byte 0: Manufacturer   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc   89H 
ID Byte 1: Part (27960)   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc   E0H 
ID Byte 2: KX             VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc   00B 
ID Byte 3: 1 Wait-State   VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)  Vcc     Vcc   01B 
           2 Wait-States                                                             10B 
Reset                     X     X     X        X        VIL    X       Vcc     Vcc   High Z 



NOTES: 

1. VIH until data terminated, at which time BLAST must go to VIL. 

2. Need to toggle from VIH to VIL to VIH to latch address. 

3. See DC Programming Characteristics for Vcc, VID and Vpp voltages. 

4. X can be VIL or VIH. 

5. Vpp = Vcc to meet standby current specification. Vcc > Vpp > VIL will cause a slight increase in standby current. 

6. The device must be in the idle state (by asserting RESET or using BLAST) before going into standby. 
CAPACITANCE(1) TA = 25°C, f = 1.0 MHz 


Symbol  Parameter            Typ   Max   Unit  Condition 
CIN     Input Capacitance    4     6     pF    VIN = 0V 
COUT    Output Capacitance   12    15    pF    VOUT = 0V 
CVPP    Vpp Capacitance      40    45    pF    VIN = 0V 

CLOCK CHARACTERISTICS 

Versions                 25 MHz        20 MHz        16 MHz 
Symbol  Parameter        Min   Max     Min   Max     Min   Max     Units 
CLK     Period           40            50            62.5          ns 
Ts      Rise Time              10            10            10      ns 
T4      Fall Time              10            10            10      ns 
T2      Low Time         7             8             11            ns 
Ts      High Time        7             8             11            ns 

Max CLK Rise Time during Programming is 100 ns 


AC TESTING LOAD CIRCUIT 

[Figure 290237-15: AC testing load circuit. Labels: DEVICE UNDER TEST, 2.1V, 780Ω, CL = 120 pF] 

For tCHQZ, CL = 5 pF and RL = 405Ω. 
CL includes jig capacitance. 

NOTE: 

1. Sampled, not 100% tested 


AC INPUT/OUTPUT REFERENCE WAVEFORMS 

[Figure 290237-14: INPUT and OUTPUT reference waveforms with the TIMING PARAMETER reference points] 

AC test inputs are driven at 2.4V (VOH) for a logic ‘1’ 
and 0.45V (VOL) for a logic ‘0’. 

Input timing begins at 1.5V. 

Output timing ends at VIH (2.0V) and VIL (0.8V). 

Input rise and fall times (10% to 90%) < 4.0 ns 


CLOCK WAVEFORM 

[Figure 290237-16: clock waveform] 
Program/Program Verify 

Initially, and after each erasure, all bits of the 
EPROM are in the “1” state. Data is introduced by 
selectively programming “0’s” into the desired bit 
locations. Although only “0’s” can be programmed, 
both “1’s” and “0’s” can be present in the data 
word. Ultraviolet erasure is the only way to change 
“0’s” to “1’s”. 

Program mode is entered when Vpp is raised to 
12.75V. Program/Verify operation is synchronous 
with the clock and can only be initiated following an 
idle state. Program and Program Verify take place in 
3 clock cycles. In the first clock cycle, addresses 
and data are input and programming occurs. Program 
Verify follows in the second clock cycle, and 
the third clock cycle terminates synchronous 
Program/Verify operation, returning the state machine 
to the idle state with outputs at high impedance. 

As in the Read mode, A2-A16 point to a four byte 
block in the memory array. During Programming the 
internal address increment circuitry is disabled and 
the programmer must supply A0 and A1 to point to 
an individual byte within the four byte block that is to 
be programmed. Only one byte is programmed in 
each 3 cycle Program/Verify sequence. 
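
The addressing and bit-programming rules above can be sketched in a few lines. This Python example is illustrative only: the helper names are hypothetical, and it assumes a 17-bit byte address (A0-A16) as implied by the text.

```python
def split_byte_address(byte_address):
    """Split a byte address into (four-byte block on A2-A16, A1, A0)."""
    if not 0 <= byte_address < (1 << 17):
        raise ValueError("address must fit in A0-A16")
    block = byte_address >> 2        # A2-A16 select the four byte block
    a1 = (byte_address >> 1) & 1     # A1 and A0 select the byte in the block
    a0 = byte_address & 1
    return block, a1, a0


def can_program_without_erase(current, desired):
    """Programming can only change 1s to 0s, so every 1 wanted must still be 1."""
    return (current & desired) == desired


print(split_byte_address(0x1A6))              # (105, 1, 0)
print(can_program_without_erase(0xFF, 0x5A))  # True: erased byte
print(can_program_without_erase(0xF0, 0x0F))  # False: needs UV erasure first
```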


Program Inhibit 

Program Inhibit mode allows parallel programming 
and verification of multiple devices with different 
data. With Vpp at 12.75V, a Program/Verify sequence 
is initiated for any device that receives a valid 
ALE pulse and rising clock edge while CS is asserted. 
A PGM pulse programs data in the first cycle 
of the sequence, and data for Program Verify is output 
in the second cycle. The Program/Verify sequence 
is inhibited on any devices for which CS is 
not asserted during the first (ALE) cycle. Data will 
not be programmed and the outputs will remain in 
their high impedance state. 


inteligent Identifier™ Mode 

The device’s manufacturer, product type, and configuration 
are stored in a four byte block that can be 
accessed by using the inteligent Identifier™ mode. 
The programmer can verify the device identifier and 
choose the programming algorithm that corresponds 
to the Intel 27960KX. The inteligent Identifier can 
also be used to verify that the product is configured 
with the desired Read mode options for wait states. 

inteligent Identifier mode is entered when A9 (pin 
32) is raised to its high voltage (VH) level. The internal 
state machine is then set for inteligent Identifier 
Read operation. Reading the Identifier is similar to a 
Read operation on a one wait state configured product. 
Up to four bytes can be read in a single burst 
access. inteligent Identifier read is terminated by a 
synchronous BLAST input, returning the state machine 
to the idle state with outputs at high impedance. 


The four byte block for the inteligent Identifier 
code is located at addresses 00H through 03H and is 
encoded as follows: 

MEANING         (A1, A0)   DATA 
Intel ID        Byte 00    89h 
27960           Byte 01    E0h 
KX              Byte 10    00b 
1 wait state    Byte 11    01b 
2 wait states   Byte 11    10b 
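
A programmer could check the four identifier bytes along the following lines. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, and it assumes that only the two low-order bits of bytes 2 and 3 carry the binary codes listed in the table.

```python
MANUFACTURER_INTEL = 0x89
PART_27960 = 0xE0
WAIT_STATES = {0b01: 1, 0b10: 2}  # byte 3 codes from the table


def decode_identifier(id_bytes):
    """Interpret the four inteligent Identifier bytes read in one burst."""
    if len(id_bytes) != 4:
        raise ValueError("expected exactly four identifier bytes")
    manufacturer, part, family, config = id_bytes
    return {
        "is_intel": manufacturer == MANUFACTURER_INTEL,
        "is_27960": part == PART_27960,
        "is_kx": (family & 0b11) == 0b00,
        "wait_states": WAIT_STATES.get(config & 0b11),  # None if unrecognized
    }


print(decode_identifier(bytes([0x89, 0xE0, 0x00, 0x02])))
```

For the bytes shown, the result reports an Intel 27960KX configured for two wait states.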


RESET MODE 

Due to the synchronous nature of the 27960KX, the 
various operating modes must be initiated from a 
known idle state. During normal operation, the internal 
state machine returns to an idle state at the 
termination of a bus access (after BLAST is asserted). 

During initial device power up, the state machine is 
in an indeterminate state. The reset mode is provided 
to force operation into the idle state. Reset mode 
is entered when the RESET pin is asserted. Output 
pins are asynchronously set to the high impedance 
state and address latches are put into the flow 
through mode. A reset is successfully completed 
and the state machine set in an idle state in the 
cycle after RESET has been asserted for a minimum 
of 10 clock cycles and deasserted for five clock 
cycles. 
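
The minimum reset sequence translates into a wall-clock time that depends on the clock frequency. The short Python sketch below only restates that arithmetic; the function name is an assumption.

```python
def reset_sequence_ns(clock_mhz, assert_cycles=10, deassert_cycles=5):
    """Minimum time for RESET asserted (10 clocks) plus deasserted (5 clocks)."""
    period_ns = 1000.0 / clock_mhz
    return (assert_cycles + deassert_cycles) * period_ns


print(reset_sequence_ns(20))  # 750.0 ns at 20 MHz
print(reset_sequence_ns(16))  # 937.5 ns at 16 MHz
```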
Figure 8. Quick-Pulse Programming™ Algorithm 
QUICK-PULSE PROGRAMMING 
ALGORITHM 

The Quick-Pulse Programming algorithm programs 
Intel’s 27960KX. Developed to substantially reduce 
programming throughput time, this algorithm allows 
optimized equipment to program a 27960KX in under 
17 seconds. Actual programming time depends 
on the programmer used. 

The Quick-Pulse Programming algorithm uses a 
100 µs pulse followed by a byte verification to determine 
when the addressed byte is correctly programmed. 
The algorithm terminates if 25 100 µs 
pulses fail to program a byte. Figure 8 shows the 
27960KX Quick-Pulse Programming algorithm flowchart. 

The entire program-pulse, byte-verify sequence is 
performed with Vcc = 6.25V and Vpp = 12.75V. 
The programming equipment must establish Vcc before 
applying voltages to any other pins. When programming 
is complete, all bytes should be compared 
to the original data with Vcc = 5.0V and Vpp = 
12.75V. 
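
The pulse-and-verify loop described above can be sketched as follows. This Python example is a hedged illustration, not the Figure 8 flowchart itself: `SimulatedByte` is a hypothetical stand-in for real programmer hardware, and only the 100 µs pulse width and the 25-pulse limit come from the text.

```python
MAX_PULSES = 25   # the algorithm gives up after 25 pulses on one byte
PULSE_US = 100    # nominal program pulse width


class SimulatedByte:
    """Hypothetical device model: the data sticks after a set number of pulses."""

    def __init__(self, pulses_needed):
        self.value = 0xFF            # erased state, all bits 1
        self.pulses_needed = pulses_needed
        self.pulses_seen = 0

    def pulse(self, data, width_us):
        self.pulses_seen += 1
        if self.pulses_seen >= self.pulses_needed:
            self.value &= data       # programming can only clear bits

    def verify(self):
        return self.value


def quick_pulse_program_byte(device, data):
    """Pulse, then verify, until the byte reads back correctly. Returns pulse count."""
    for pulse_count in range(1, MAX_PULSES + 1):
        device.pulse(data, PULSE_US)
        if device.verify() == data:
            return pulse_count
    raise RuntimeError("byte failed to program after 25 pulses")


print(quick_pulse_program_byte(SimulatedByte(3), 0x5A))  # 3
```

In real equipment the `pulse` and `verify` calls would drive PGM and read the data pins within the limits of the AC programming table; the final full-array comparison against the original data is a separate pass.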


D.C. PROGRAMMING CHARACTERISTICS TA = 25°C ± 5°C 

Symbol  Parameter                          Notes  Min     Max         Unit  Test Condition 
ILI     Input Load Current                                10          µA    VIN = VIH or VIL 
ICC     Vcc Program Current                1              125         mA    CS = VIL 
IPP     Vpp Program Current                1              50          mA    CS = VIL 
VIL     Input Low Voltage                         -0.5    0.8         V 
VIH     Input High Voltage                        2.0     Vcc + 0.5   V 
VOL     Output Low Voltage (Verify)                       0.40        V     IOL = 2.1 mA 
VOH     Output High Voltage (Verify)              [illegible]         V     IOH = -400 µA 
VID     A9 inteligent Identifier Voltage          11.5    12.5        V 
Vcc     Supply Voltage (Program)           2      6.0     6.5         V 
Vpp     Program Voltage                    2      12.5    13.0        V 


NOTES: 

1. The maximum current value is with outputs unloaded. 

2. Vcc must be applied simultaneously or before Vpp and removed simultaneously or after Vpp. 

3. During programming, clock levels are VIH and VIL. 
AC PROGRAMMING, RESET AND ID CHARACTERISTICS TA = 25°C ± 5°C 


No  Symbol   Parameter                         Notes  Min   Max   Units 
1   tAVPL    Address Valid to PGM Low                 2           µs 
2   tCHAX    CLK High to Address Invalid              50          ns 
3   tLLCH    ALE Low to CLK High               1      50          ns 
4   tCHLH    CLK High to ALE High              2      50          ns 
5   tSVCH    CS Valid to CLK High                     50          ns 
6   tCHSX    CLK High to CS Invalid            3                  ns 
7   tCHQV    CLK High to DOUT Valid                         100   ns 
8   tCHQX    CLK High to DOUT Invalid                 0           ns 
9   tBVCH    BLAST Valid to CLK High                  50          ns 
10  tCHBX    CLK High to BLAST Invalid         4      50          ns 
11  tQVPL    DATA Valid to PGM Low                    2           µs 
12  tPLPH    PGM Program Pulse Width                  95    105   µs 
13  tPHQX    PGM High to DIN Invalid                  2           µs 
14  tCLPL    CLK Low to PGM Low                       50          ns 
15  tQZCH    DIN in Tri-State to CLK High             2           µs 
16  tVCS     Vcc Program Voltage to CLK High   7      2           µs 
17  tVPS     Vpp Program Voltage to CLK High   7      2           µs 
18  tA9HCH   A9 VID Voltage to CLK High               2           µs 
19  tCHA9X   CLK High to A9 not VID Voltage           2           µs 
20  tRVCH    RESET Valid to CLK High           6      50          ns 
21  tCHCL    CLK High to CLK Low               5      100         ns 
22  tCLCH    CLK Low to CLK High               5      100         ns 


NOTES: 

1. If CS is low, ALE can go low no sooner than the falling edge of the previous CLK. 

2. ALE must return high prior to the next rising edge of clock. 

3. CS must remain low until after the rising edge of CLK1. 

4. BLAST must return high prior to the next rising edge of CLK. 

5. Max CLK rise/fall time is 100 ns. 

6. RESET must be held low for 10 cycles and high for 5 cycles before performing a read. 

7. Vcc must be applied simultaneously or before Vpp and removed simultaneously or after Vpp. 
HIGH-PERFORMANCE 32-BIT LOCAL 
AREA NETWORK COPROCESSOR 
■ Performs Complete CSMA/CD Medium 
Access Control (MAC) Functions— 
Independently of CPU 

— IEEE 802.3 (EOC) Frame Delimiting 
— HDLC Frame Delimiting 

■ Supports Industry Standard LANs 
— IEEE TYPE 10BASE-T, 
IEEE TYPE 10BASE5 (Ethernet*), 
IEEE TYPE 10BASE2 (Cheapernet), 
IEEE TYPE 1BASE5 (StarLAN), 
and the Proposed Standard 
10BASE-F 

— Proprietary CSMA/CD Networks Up 
to 20 Mb/s 

■ On-Chip Memory Management 
— Automatic Buffer Chaining 

— Buffer Reclamation after Receipt of 
Bad Frames; Optional Save Bad 
Frames 

— 32-Bit Segmented or Linear (Flat) 
Memory Addressing Formats 

■ Network Management and Diagnostics 
— Monitor Mode 

— 32-Bit Statistical Counters 

■ 82586 Software Compatible 


■ Optimized CPU Interface 
— Optimized Bus Interface to Intel’s 
i486™ DX, i486™ SX and 80960CA 
Processors 

— Supports Big Endian and Little 
Endian Byte Ordering 

■ 32-Bit Bus Master Interface 
— 106 MB/s Bus Bandwidth 
— Burst Bus Transfers 

— Bus Throttle Timers 
— Transfers Data at 100% of Serial 
Bandwidth 

— 128-Byte Receive FIFO, 64-Byte 
Transmit FIFO 

■ Self-Test Diagnostics 

■ Configurable Initialization Root for Data 
Structures 

■ High-Speed, 5V, CHMOS** IV 
Technology 

■ 132-Pin Plastic Quad Flat Pack (PQFP) 
and PGA Package 

(See Packaging Spec Order No. 240800-001, 
Package Type KU and A) 

i486 is a trademark of Intel Corporation. 
*Ethernet is a registered trademark of Xerox Corporation. 
**CHMOS is a patented process of Intel Corporation. 
Figure 1. 82596CA Block Diagram 
INTRODUCTION 

The 82596CA is an intelligent, high-performance 
32-bit Local Area Network coprocessor. The 
82596CA implements the CSMA/CD access method 
and can be configured to support all existing IEEE 
802.3 standards: TYPES 10BASE-T, 10BASE5, 
10BASE2, 1BASE5, and 10BROAD36. It can also be 
used to implement the proposed standard TYPE 
10BASE-F. The 82596CA performs high-level commands, 
command chaining, and interprocessor communications 
via shared memory, thus relieving the 
host CPU of many tasks associated with network 
control. All time-critical functions are performed 
independently of the CPU, which increases network 
performance and efficiency. The 82596CA bus interface 
is optimized for Intel’s i486™ SX, i486™ DX, 
80960CA, and 80960KB processors. 

The 82596CA implements all IEEE 802.3 Medium 
Access Control and channel interface functions. 
These include framing, preamble generation and 
stripping, source address generation, destination 
address checking, short-frame detection, and automatic 
length-field handling. Data rates up to 20 Mb/s are 
supported. 

The 82596CA provides a powerful host system interface. 
It manages memory structures automatically, 
with command chaining and bidirectional data chaining. 
An on-chip DMA controller manages four channels; 
this allows autonomous transfer of data blocks 
(buffers and frames) and relieves the CPU of byte 
transfer overhead. Buffers containing errored or 
collided frames can be automatically recovered without 
CPU intervention. The 82596CA provides an upgrade 
path for existing 82586 software drivers by 
providing an 82586-software-compatible mode that 
supports the current 82586 memory structure. The 
82596CA also has a Flexible memory structure and 
a Simplified memory structure. The 82596CA can 
address up to 4 gigabytes of memory. The 82596CA 
supports Little Endian and Big Endian byte ordering. 

The 82596CA bus interface can achieve a burst 
transfer rate of 106 MB/s at 33 MHz. The bus interface 
employs bus throttle timers to regulate 
82596CA bus use. Two large, independent FIFOs 
(128 bytes for Receive and 64 bytes for Transmit) 
tolerate long bus latencies and provide programmable 
thresholds that allow the user to optimize bus 
overhead for any worst-case bus latency. The 
high-performance bus is capable of back-to-back 
transmission and reception during the IEEE 802.3 9.6-µs 
Interframe Spacing (IFS) period. 
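
The two figures in this paragraph can be reproduced with simple arithmetic. The Python sketch below is illustrative: it assumes a 16-byte burst (four 32-bit transfers) completing in five clocks, which is an assumption about how the 106 MB/s figure is derived rather than something stated here, and it uses the IEEE 802.3 interframe gap of 96 bit times at 10 Mb/s.

```python
def burst_rate_mb_per_s(clock_mhz, bytes_per_burst=16, clocks_per_burst=5):
    """Burst bandwidth, assuming 16 bytes move in 5 clocks (assumed burst shape)."""
    return clock_mhz * bytes_per_burst / clocks_per_burst


def interframe_spacing_us(bit_rate_mbps, ifs_bit_times=96):
    """Interframe spacing as a number of bit times at the given serial rate."""
    return ifs_bit_times / bit_rate_mbps


print(round(burst_rate_mb_per_s(33), 1))  # 105.6, about 106 MB/s
print(interframe_spacing_us(10))          # 9.6 µs at 10 Mb/s
```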

The 82596CA provides a wide range of diagnostics 
and network management functions. These include 
internal and external loopback, exception condition 
tallies, channel activity indicators, optional capture 
of all frames regardless of destination address 
(promiscuous mode), optional capture of errored or 
collided frames, and time domain reflectometry for 
locating fault points on the network cable. The 
statistical counters, in 32-bit segmented and linear 
modes, are 32 bits each and include CRC errors, 
alignment errors, overrun errors, resource errors, 
short frames, and received collisions. The 82596CA 
also features a monitor mode for network analysis. 
In this mode the 82596CA can capture status bytes, 
and update statistical counters, of frames monitored 
on the link without transferring the contents of the 
frames to memory. This can be done concurrently 
while transmitting and receiving frames destined for 
that station. 

The 82596CA can be used in both baseband and 
broadband networks. It can be configured for maximum 
network efficiency (minimum contention overhead) 
with networks of any length. Its highly flexible 
CSMA/CD unit supports address field lengths of 
zero through six bytes, configurable to either IEEE 
802.3/Ethernet or HDLC frame delimitation. It also 
supports 16- or 32-bit cyclic redundancy checks. 
The CRC can be transferred directly to memory for 
receive operations, or dynamically inserted for transmit 
operations. The CSMA/CD unit can also be configured 
for full duplex operation for high throughput 
in point-to-point connections. 


82596 B-Stepping 

The 82596 B-step incorporates new features compared 
to the 82596 A1 stepping. The following is a 
summary of the 82596 B-step new features. 

• The 82596 B-step transmit buffers can now be 
byte aligned. 

• In big endian mode, and when configured to Linear 
mode, the 82596 B-step treats 32-bit address 
pointers as big endian 32-bit entities. However, 
the SCB absolute address and statistical counters 
are still treated as two 16-bit big endian entities. 
This big endian 32-bit entity support is configured 
through the SYSBUS byte; not setting this 
mode will configure the 82596 B-step to be 100% 
compatible with the 82596 A1-step big endian 
mode. 

• The 82596 B-step has improved performance on 
back-to-back frame transmission. 

• The 82596 B-step can be configured to reread 
the next Command Block on the CB list upon 
receiving a CU RESUME Control Command. 

The 82596CA is fabricated with Intel’s reliable, 5-V, 
CHMOS IV (process 648.8) technology. It is available 
in a 132-pin PQFP or PGA package. 
82596CA PGA Cross Reference by Pin Name 
PIN DESCRIPTIONS 


Symbol    PQFP Pin No.   Type 
CLK 
D0-D31    14-53          I/O 
DP0-DP3   4-7            I/O 
PCHK      127 
A31-A2    70-108 
BE3-BE0   109-114 
W/R       120 

Name and Function 


CLOCK. The system clock input provides the fundamental timing for 
the 82596. It is a 1X CLK input used to generate the 82596 clock and 
requires TTL levels. All external timing parameters are specified in 
reference to the rising edge of CLK. 

DATA BUS. The 32 Data Bus lines are bidirectional, tri-state lines that 
provide the general purpose data path between the 82596 and 
memory. With the 82596 the bus can be either 16 or 32 bits wide; this 
is determined by the BS16 signal. The 82596 always drives all 32 data 
lines during Write operations, even with a 16-bit bus. D31-D0 are 
floated after a Reset or when the bus is not acquired. 

These lines are inputs during a CPU Port access; in this mode the CPU 
writes the next address to the 82596 through the data lines. During 
PORT commands (Relocatable SCP, Self-Test, Reset and Dump) the 
address must be aligned to a 16-byte boundary. This frees the D3-D0 
lines so they can be used to distinguish the commands. The following 
is a summary of the decoding data. 


D0   D1   D2   D3   D31-D4   Function 
0    0    0    0    0000     Reset 
0    1    0    0    ADDR     Relocatable SCP 
1    0    0    0    ADDR     Self-Test 
1    1    0    0    ADDR     Dump Command 
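
The decoding table above amounts to placing a command code in D3-D0 and a 16-byte-aligned address in D31-D4. The following Python sketch illustrates that packing; the constant and function names are hypothetical, and how the 32-bit word is actually delivered over the two PORT strobes is outside its scope.

```python
# Command codes formed from D3..D0, with D0 as the least significant bit.
PORT_RESET = 0b0000
PORT_SELF_TEST = 0b0001        # D0 = 1
PORT_RELOCATABLE_SCP = 0b0010  # D1 = 1
PORT_DUMP = 0b0011             # D0 = D1 = 1


def port_word(function, address=0):
    """Combine a 16-byte-aligned address (D31-D4) with a command code (D3-D0)."""
    if function not in (PORT_RESET, PORT_SELF_TEST, PORT_RELOCATABLE_SCP, PORT_DUMP):
        raise ValueError("unknown PORT function")
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if address & 0xF:
        raise ValueError("address must be aligned to a 16-byte boundary")
    if function == PORT_RESET and address:
        raise ValueError("Reset carries zeros in D31-D4")
    return address | function


print(hex(port_word(PORT_RELOCATABLE_SCP, 0x00100000)))  # 0x100002
print(hex(port_word(PORT_DUMP, 0x2000)))                 # 0x2003
```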


DATA PARITY. These are tri-stated data parity pins. There is one 
parity line for each byte of the data bus. The 82596 drives them with 
even-parity information during write operations having the same timing 
as data writes. Likewise, even-parity information, with the same timing 
as read information, must be driven back to the 82596 over these pins 
to ensure that the correct parity check status is indicated by the 
82596. 
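
Even parity per byte lane can be computed as shown below. This Python sketch is illustrative only; the function names are hypothetical, and the pairing of DP0 with D7-D0 through DP3 with D31-D24 is an assumption made by analogy with the byte enable signals.

```python
def even_parity_bit(byte):
    """Parity bit that makes the number of 1s in (data byte + parity bit) even."""
    return bin(byte & 0xFF).count("1") & 1


def data_parity_bits(word):
    """Return [DP0, DP1, DP2, DP3] for a 32-bit word, one bit per byte lane."""
    return [even_parity_bit((word >> (8 * lane)) & 0xFF) for lane in range(4)]


print(data_parity_bits(0x01030007))  # [1, 0, 0, 1]
```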

PARITY CHECK. This pin is driven high one clock after RDY to inform 
Read operations of the parity status of data sampled at the end of the 
previous clock cycle. When driven low it indicates that incorrect parity 
data has been sampled. It only checks the parity status of enabled 
bytes, which are indicated by the Byte Enable and Bus Size signals. 
PCHK is only valid for one clock time after data read is returned to the 
82596; i.e., it is inactive (high) at all other times. 

ADDRESS LINES. These 30 tri-stated Address lines output the 
address bits required for memory operation. These lines are floated 
after a Reset or when the bus is not acquired. 

BYTE ENABLE. These tri-stated signals are used to indicate which 
bytes are involved with the current memory access. The number of 
Byte Enable signals asserted indicates the physical size of the data 
being transferred (1, 2, 3, or 4 bytes). 

• BE0 indicates D7-D0 

• BE1 indicates D15-D8 

• BE2 indicates D23-D16 

• BE3 indicates D31-D24 

These lines are floated after a Reset or when the bus is not acquired. 

WRITE/READ. This dual function pin is used to distinguish Write and 
Read cycles. This line is floated after a Reset or when the bus is not 
acquired. 




PIN DESCRIPTIONS (Continued) 

Symbol   PQFP Pin No.   Type   Name and Function 

ADS   124   O 

ADDRESS STATUS. The 82596 uses this tri-state pin to indicate 
that a valid bus cycle has begun and that A31-A2, BE3-BE0, 
and W/R are being driven. It is asserted during T1 bus states. This line 
is floated after a Reset or when the bus is not acquired. 

RDY   130   I 

READY. Active low. This signal is the acknowledgment from 
addressed memory that the transfer cycle can be completed. When 
high, it causes wait states to be inserted. It is ignored at the end of the 
first clock of the bus cycle’s data cycle. This active-low signal does not 
have an internal pull-up resistor. This signal must meet the setup and 
hold times to operate correctly. 

BRDY   2   I 

BURST READY. Active low. Burst Ready, like RDY, indicates that the 
external system has presented valid data on the data pins in response 
to a Read, or that the external system has accepted the 82596 data in 
response to a Write request. Also, like RDY, this signal is ignored at 
the end of the first clock in a bus cycle. If the 82596 can still receive 
data from the previous cycle, ADS will not be asserted in the next 
clock cycle; however, Address and Byte Enable will change to reflect 
the next data item expected by the 82596. BRDY will be sampled 
during each succeeding clock and, if active, the data on the pins will be 
strobed to the 82596 or to external memory (read/write). BRDY 
operates exactly like READY during the last data cycle of a burst 
sequence and during nonburstable cycles. 

BLAST   128   O 

BURST LAST. A signal (active low) on this tri-state pin indicates that 
the burst cycle is finished and when BRDY is next returned it will be 
treated as a normal ready; i.e., another set of addresses will be driven 
with ADS or the bus will go idle. BLAST is not asserted if the bus is not 
acquired. 

AHOLD   117   I 

ADDRESS HOLD. This hold signal is active high; it allows another bus 
master to access the 82596 address bus. In a system where an 82596 
and an i486 processor share the local bus, AHOLD allows the cache 
controller to make a cache invalidation cycle while the 82596 holds the 
address lines. In response to a signal on this pin, the 82596 
immediately (i.e., during the next clock) stops driving the entire address 
bus (A31-A2); the rest of the bus can remain active. For example, 
data can be returned for a previously specified bus cycle during 
Address Hold. The 82596 will not begin another bus cycle while 
AHOLD is active. 

BOFF   116   I 

BACKOFF. This signal is active low; it informs the 82596 that another 
bus master requires access to the bus before the 82596 bus cycle 
completes. The 82596 immediately (i.e., during the next clock) floats its 
bus. Any data returned to the 82596 while BOFF is asserted is ignored. 
BOFF has higher priority than RDY or BRDY; if two such signals are 
returned in the same clock period, BOFF is given preference. The 
82596 remains in Hold until BOFF goes high, then the 82596 resumes 
its bus cycle by driving out the address and status, and asserting ADS. 
BOFF should not be asserted during T1. 

LOCK   126   O 

LOCK. This tri-state pin is used to distinguish locked and unlocked bus 
cycles. LOCK generates a semaphore handshake to the CPU. LOCK 
can be active for several memory cycles; it goes active during the first 
locked memory cycle (T1) and goes inactive at the last locked cycle 
(T2). This line is floated after a Reset or when the bus is not acquired. 
LOCK can be disabled via the sysbus byte in software. 
PIN DESCRIPTIONS (Continued) 


Symbol   PQFP Pin No.   Type   Name and Function 

BS16   129   I 

BUS SIZE. This signal allows the 82596CA to work with either 16- or 
32-bit buses. Asserting BS16 low causes the 82596 to perform two 
16-bit memory accesses when transferring 32-bit data. In Little Endian 
mode the D15-D0 lines are driven when BS16 is asserted; in Big 
Endian mode the D31-D16 lines are driven. 

HOLD   123   O 

HOLD. The HOLD signal is active high; the 82596 uses it to request 
local bus mastership. In normal operation HOLD goes inactive before 
HLDA. The 82596 can be forced off the bus by deasserting HLDA or if 
the bus throttle timers expire. 

HLDA   118   I 

HOLD ACKNOWLEDGE. The HLDA signal is active high; it indicates 
that bus mastership has been given to the 82596. HLDA is internally 
synchronized; after HOLD is detected low, the CPU drives HLDA low. 

NOTE: 
Do not connect HLDA to Vcc; it will cause a deadlock. A user wanting 
to give the 82596 permanent access to the bus should connect HLDA 
to HOLD. If HLDA goes inactive before HOLD, the 82596 will release 
the bus (by deasserting HOLD) within a maximum number of bus 
cycles specified in the 82596 User’s Manual. 

BREQ   115   I 

BUS REQUEST. This signal, when configured to an externally 
activated mode, is used to trigger the bus throttle timers. 

PORT   3   I 

PORT. When this signal is received, the 82596 latches the data on the 
data bus into an internal 32-bit register. When the CPU is asserting this 
signal it can write into the 82596 (via the data bus). This pin must be 
activated twice during all CPU Port access commands. 

RESET   69   I 

RESET. This active high, internally synchronized signal causes the 
82596 to terminate current activity. The signal must be high for at least 
five system clock cycles. After five system clock cycles and four TxC 
clock cycles the 82596 will execute a Reset when it receives a high 
RESET signal. When RESET returns to low the 82596 waits for the 
first CA signal and then begins the initialization sequence. 

LE/BE   65   I 

LITTLE ENDIAN/BIG ENDIAN. This dual-function pin is used to 
select byte ordering. When LE/BE is high, little endian byte ordering is 
used; when low, big endian byte ordering is used for data in frames 
(bytes) and for control (SCB, RFD, CBL, etc). 

CA   119   I 

CHANNEL ATTENTION. The CPU uses this pin to force the 82596 to 
begin executing memory resident Command blocks. The CA signal is 
internally synchronized. The signal must be high for at least one 
system clock. It is latched internally on the high to low edge and then 
detected by the 82596. 

The first CA after a Reset forces the 82596 into the initialization 
sequence beginning at location 00FFFFF6h or an SCP address written 
to the 82596 using CPU Port access. All subsequent CA signals cause 
the 82596 to begin executing new command sequences from the SCB. 

INT/INT   125   O 

INTERRUPT. A high signal on this pin notifies the CPU that the 82596 
is requesting an interrupt. This signal is an edge triggered interrupt 
signal, and can be configured to be active high or low. 
PIN DESCRIPTIONS (Continued) 


Symbol 

PQFP 

Pin No. 

Type 

Name and Function 

Vcc 

17 Pins 


POWER. + 5V ±10%. 

Vss 

17 Pins 


GROUND. 0 V. 

TxD 

54 

0 

TRANSMIT DATA. This pin transmits data to the serial link. It Is high 
when not transmitting. 


TxC 

64 

I 

TRANSMIT CLOCK. This signal provides the fundamental timing for 
the serial subsystem. The clock is also used to transmit data 
synchronously on the TxD pin. For NRZ encoding, data is transferred 
to the TxD pin on the high to low clock transition. For Manchester 
encoding, the transmitted bit center is aligned with the low to high 
transition. Transmit clock must always be running for proper device 
operation. 

LPBK 

58 

O 

LOOPBACK. This TTL-level control signal enables the loopback 
mode. In this mode serial data on the TxD input is routed through the 
82C501 internal circuits and back to the RxD output without driving the 
transceiver cable. To enable this signal, both internal and external 
loopback need to be set with the Configure command. 

RxD 

60 

I 

RECEIVE DATA. This pin receives NRZ serial data only. It must be 
high when not receiving. 


RxC 

59 

I 

RECEIVE CLOCK. This signal provides timing information to the 
internal shifting logic. For NRZ data the state of the RxD pin is 
sampled on the high to low transition of the clock. 

RTS 

57 

O 

REQUEST TO SEND. When this signal is low the 82596 informs the 
external interface that it has data to transmit. It is forced high after a 
Reset or when transmission is stopped. 

CTS 

62 

I 

CLEAR TO SEND. An active-low signal that enables the 82596 to 
send data. It is normally used as an interface handshake to RTS. 
Asserting CTS high stops transmission. CTS is internally synchronized. 
If CTS goes inactive, meeting the setup time to the TxC negative edge, 
the transmission will stop and RTS will go inactive within, at most, two 
TxC cycles. 

CRS 

63 

I 

CARRIER SENSE. This active-low signal is used to notify the 
82596 that traffic is on the serial link. It is only used if the 82596 is 
configured for external Carrier Sense. In this configuration external 
circuitry is required for detecting traffic on the serial link. CRS is 
internally synchronized. To be accepted, the signal must remain active 
for at least two serial clock cycles (for CRSF = 0). 

CDT 

61 

I 

COLLISION DETECT. This active-low signal informs the 82596 that a 
collision has occurred. It is only used if the 82596 is configured for 
external Collision Detect. External circuitry is required for collision 
detection. CDT is internally synchronized. To be accepted, the signal 
must remain active for at least two serial clock cycles (for CDTF = 0). 
82596 AND HOST CPU INTERACTION 

The 82596CA and the host CPU communicate 
through shared memory. Because of its on-chip 
DMA capability, the 82596 can make data block 
transfers (buffers and frames) independently of the 
CPU; this greatly reduces the CPU byte transfer 
overhead. 

The 82596 is a multitasking coprocessor that com- 
prises two independent logical units — the Command 
Unit (CU) and the Receive Unit (RU). The CU exe- 
cutes commands from shared memory. The RU han- 
dles all activities related to frame reception. The in- 
dependence of the CU and RU enables the 82596 to 
engage in both activities simultaneously: the CU 
can fetch and execute commands from memory 
while the RU is storing received frames in memory. 
The CPU is only involved with this process after the 
CU has executed a sequence of commands or the 
RU has finished storing a sequence of frames. 

The CPU and the 82596 use the hardware signals 
Interrupt (INT) and Channel Attention (CA) to initiate 
communication with the System Control Block 
(SCB); see Figure 4. The 82596 uses INT to alert the 
CPU of a change in the contents of the SCB; the 
CPU uses CA to alert the 82596. 

The 82596 has a CPU Port Access state that allows 
the CPU to execute certain functions without accessing 
memory. The 82596 PORT pin and data bus 
pins are used to enable this feature. The CPU can 
directly activate four operations when the 82596 is in 
this state. 

• Write an alternative System Configuration Pointer 
(SCP). This can be used when the 82596 cannot 
use the default SCP address space. 

• Write a different Dump Command Pointer and ex- 
ecute Dump. This can be used for troubleshoot- 
ing No Response problems. 

• The CPU can reset the 82596 via software with- 
out disturbing the rest of the system. 

• A self-test can be used for board testing; the 
82596 will execute a self-test and write the re- 
sults to memory. 


82596 BUS INTERFACE 

The 82596CA has bus interface timings and pin 
definitions that are compatible with Intel’s 32-bit 
i486™ SX and i486™ DX microprocessors. This 
eliminates the need for additional bus interface logic. 
Operating at 33 MHz, the 82596’s bus bandwidth 
can be as high as 106 MB/s. Since Ethernet only 
requires 1.25 MB/s, this leaves a considerable 
amount of bandwidth for the CPU. The 82596 also 
has a bus throttle to regulate its use of the bus. Two 
timers can be programmed through the SCB: one 
controls the maximum time the 82596 can remain on 
the bus, the other controls the time the 82596 must 
stay off the bus (see Figure 5). The bus throttle can 
be programmed to trigger internally with HLDA or 
externally with BREQ. These timers can restrict the 
82596 HOLD activation time and improve bus utiliza- 
tion. 
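As a rough check of the bandwidth figures quoted above, the following sketch works out how much of the peak bus bandwidth a saturated link would consume. It assumes the standard 10 Mb/s Ethernet serial rate, which is what yields the 1.25 MB/s in the text; the variable names are illustrative only.

```python
# Figures quoted in the text above; the names are illustrative only.
PEAK_BUS_MB_PER_S = 106.0      # peak 82596 bus bandwidth at 33 MHz
ETHERNET_MBIT_PER_S = 10.0     # assumed 10 Mb/s Ethernet serial rate

# 8 bits per byte: 10 Mb/s corresponds to 1.25 MB/s.
ethernet_mb_per_s = ETHERNET_MBIT_PER_S / 8

# Fraction of the peak bus bandwidth a saturated link would consume.
bus_share = ethernet_mb_per_s / PEAK_BUS_MB_PER_S

print(f"Ethernet load: {ethernet_mb_per_s:.2f} MB/s "
      f"({bus_share:.1%} of peak bus bandwidth)")
```

The share comes out at a little over one percent, which is why the bus throttle timers can keep the 82596 off the bus most of the time without starving the serial link.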

82596 MEMORY ADDRESSING 

The 82596 has a 32-bit memory address range, 
which allows addressing up to four gigabytes of 
memory. The 82596 has three memory addressing 
modes (see Table 1). 

• 82586 Mode. The 82596 has a 24-bit memory 
address range. The System Control Block, Com- 
mand List, Receive Descriptor List, and Buffer 
Descriptors must reside in one 64-KB memory 
segment. Transmit and Receive buffers can re- 
side in a 24-bit address space. 

• 32-Bit Segmented Mode. The 82596 has a 32- 
bit memory address range. The System Control 
Block, Command List, Receive Descriptor List, 
and Buffer Descriptors must reside in one 64-KB 
memory segment. Transmit and Receive buffers 
can reside in a 32-bit address space. 

• Linear Mode. The 82596 has a 32-bit memory 
address range. Any memory structure can reside 
anywhere within the 32-bit memory address 
range. 
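The difference between the three modes can be sketched as a small address-resolution helper. This is an illustration of the addressing rules summarized in Table 1, not Intel code: the `Mode` enum and the function name are hypothetical, and the sketch does not model what the device does if base plus offset overflows the address range.

```python
from enum import Enum


class Mode(Enum):
    """The three 82596 memory addressing modes (names are illustrative)."""
    MODE_82586 = "82586"
    SEGMENTED_32 = "32-bit segmented"
    LINEAR = "linear"


def resolve_structure_address(mode, base, pointer):
    """Return the physical address of a control structure such as the SCB,
    a Command Block, a Frame Descriptor or a Buffer Descriptor.

    In 82586 and 32-bit Segmented modes `pointer` is a 16-bit offset that
    is added to `base`. In Linear mode `pointer` is already a 32-bit
    address and `base` is ignored.
    """
    if mode is Mode.LINEAR:
        if not 0 <= pointer <= 0xFFFFFFFF:
            raise ValueError("linear pointer must fit in 32 bits")
        return pointer
    if not 0 <= pointer <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    base_bits = 24 if mode is Mode.MODE_82586 else 32
    if not 0 <= base < (1 << base_bits):
        raise ValueError(f"base must fit in {base_bits} bits")
    return base + pointer


# Example: the same 16-bit offset in a segmented mode and a linear pointer.
print(hex(resolve_structure_address(Mode.SEGMENTED_32, 0x80120000, 0x0040)))
print(hex(resolve_structure_address(Mode.LINEAR, 0, 0x80120040)))
```

Because the offset is only 16 bits wide, every structure reached through it has to sit within 64 KB of the base, which is the single-segment restriction stated for the two segmented modes.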
Figure 4. 82596 and Host CPU Intervention 

Figure 5. Bus Throttle Timers 


Table 1. 82596 Memory Addressing Formats 


                                              Operation Mode 
Pointer or Offset         82586                      32-Bit Segmented           Linear 

ISCP Address              24-Bit Linear              32-Bit Linear              32-Bit Linear 
SCB Address               Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Command Block Pointers    Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Rx Frame Descriptors      Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Tx Frame Descriptors      Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Rx Buffer Descriptors     Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Tx Buffer Descriptors     Base (24) + Offset (16)    Base (32) + Offset (16)    32-Bit Linear 
Rx Buffers                24-Bit Linear              32-Bit Linear              32-Bit Linear 
Tx Buffers                24-Bit Linear              32-Bit Linear              32-Bit Linear 



Figure 6. 82596 Shared Memory Structure 


82596 SYSTEM MEMORY STRUCTURE 

The Shared Memory structure consists of four parts: 
the Initialization Root, the System Control Block, the 
Command List, and the Receive Frame Area (see 
Figure 6). 

The Initialization Root is in an established location 
known to the host CPU and the 82596 (00FFFFF6h). 
However, the CPU can establish the Initialization 
Root in another location by using the CPU Port ac- 
cess. This root is accessed during initialization, and 
points to the System Control Block. 


The System Control Block serves as a bidirectional 
mail drop for the host CPU and the 82596 CU and 
RU. It is the central point through which the CPU and 
the 82596 exchange control and status information. 
The SCB has two areas. The first contains instruc- 
tions from the CPU to the 82596. These include: 
control of the CU and RU (Start, Abort, Suspend, 
and Resume), a pointer to the list of CU commands, 
a pointer to the Receive Frame Area, a set of Inter- 
rupt Acknowledge bits, and the T-ON and T-OFF 
timers for the bus throttle. The second area contains 
status information the 82596 is sending to the CPU, 
such as the CU and RU states (Idle, Active, 
Ready, Suspended, No Receive Resources, etc.), 
interrupt bits (Command Completed, Frame Received, 
CU Not Ready, and RU Not Ready), and statistical 
counters. 

The Command List functions as a program for the 
CU; individual commands are placed in memory 
units called Command Blocks (CBs). These CBs 
contain the parameters and status of specific 
high-level commands called Action Commands; e.g., 
Transmit or Configure. 

Transmit causes the 82596 to transmit a frame. The 
Transmit CB contains the destination address, the 
length field, and a pointer to a list of linked buffers 
holding the frame that is to be constructed from sev- 
eral buffers scattered throughout memory. The 
Command Unit operates without CPU intervention; 
the DMA for each buffer, and the prefetching of ref- 
erences to new buffers, is performed in parallel. The 
CPU is notified only after a transmission is complete. 

The Receive Frame Area is a list of Free Frame De- 
scriptors (descriptors not yet used) and a list of user- 
prepared buffers. Frames arrive at the 82596 unso- 
licited; the 82596 must always be ready to receive 
and store them in the Free Frame Area. The 
Receive Unit fills the buffers when it receives frames, 
and reformats the Free Buffer List into 
received-frame structures. The frame structure is, for all 
practical purposes, identical to the format of the frame to 
be transmitted. The first Frame Descriptor is 
referenced by the SCB. Unless the 82596 is configured 
to Save Bad Frames, the frame descriptor and the 
associated buffer descriptor, which are wasted when 
a bad frame is received, are automatically reclaimed 
and returned to the Free Buffer List. 

Receive buffer chaining (storing incoming frames in 
a linked buffer list) significantly improves memory 
utilization. Without buffer chaining, the user must al- 
locate consecutive blocks of memory, each capable 
of containing a maximum frame (for Ethernet, 1518 
bytes). Since an average frame is about 200 bytes, 
this is very inefficient. With buffer chaining, the user 
can allocate small buffers and the 82596 will only 
use those that are needed. 
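The saving can be made concrete with a small calculation. The 1518-byte maximum and the 200-byte average frame are taken from the text; the 256-byte chained buffer size is a hypothetical choice made only for this illustration.

```python
import math

MAX_ETHERNET_FRAME = 1518   # bytes, maximum frame size from the text
AVERAGE_FRAME = 200         # bytes, average frame size from the text
SMALL_BUFFER = 256          # bytes, hypothetical chained-buffer size


def utilization(frame_len, buffer_size):
    """Fraction of the allocated buffer memory that holds frame data when
    a frame is stored in as many `buffer_size` buffers as it needs."""
    buffers_used = math.ceil(frame_len / buffer_size)
    return frame_len / (buffers_used * buffer_size)


# Without chaining every frame occupies one maximum-size buffer.
without_chaining = utilization(AVERAGE_FRAME, MAX_ETHERNET_FRAME)

# With chaining only the small buffers actually needed are consumed.
with_chaining = utilization(AVERAGE_FRAME, SMALL_BUFFER)

print(f"without chaining: {without_chaining:.0%} of buffer memory used")
print(f"with chaining:    {with_chaining:.0%} of buffer memory used")
```

Under these assumptions an average frame fills roughly an eighth of a maximum-size buffer but most of a single small buffer, while a maximum-size frame still fits by chaining six small buffers.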

Figure 7 A-D illustrates how the 82596 uses the 
Receive Frame Area. Figure 7A shows an unused 
Receive Frame Area composed of Free Frame De- 
scriptors and Free Receive Buffers prepared by the 
user. The SCB points to the first Frame Descriptor of 
the Frame Descriptor List. Figure 7B shows the 
same Receive Frame Area after receiving one 
frame. This first frame occupies two Receive Buffers 
and one Frame Descriptor (a valid received frame 
will only occupy one Frame Descriptor). After 
receiving this frame the 82596 sets the next Free Frame 
Descriptor RBD pointer to the next Free RBD. Figure 
7C shows the RFA after receiving a second frame. 
In this example the second frame occupies only one 
Receive Buffer and one RFD. The 82596 again sets 
the RBD pointer. This process is repeated in 
Figure 7D, showing the reception of another frame 
using one Receive Buffer; in this example there is an 
extra Frame Descriptor. 


TRANSMIT AND RECEIVE MEMORY 
STRUCTURES 

There are three memory structures for reception and 
transmission: the 82586 memory structure, the 
Flexible memory structure, and the Simplified 
memory structure. The 82586 mode is selected by 
configuring the 82596 during initialization. In this mode all 
the 82596 memory structures are compatible with 
the 82586 memory structures. 

When the 82596 is not configured to the 82586 
mode, the other two memory structures, Simplified 
and Flexible, are available for transmitting and 
receiving. These structures can be selected on a 
frame-by-frame basis by setting the S/F bit in the 
Transmit Command and the Receive Frame 
Descriptor (see Figures 29, 30, 41, and 42). The 
Simplified memory structure offers a simple structure for 
ease of programming (see Figure 8). All information 
about a frame is contained in one structure; for 
example, during reception the RFD and data field are 
contained in one structure. 

The Flexible memory structure (see Figure 9) has a 
control field that allows the programmer to specify 
the amount of receive data the RFD will contain for 
receive operations and the amount of transmit data 
the Transmit Command Block will contain for trans- 
mit operations. For example, when the control field 
in the RFD is set to 20 bytes during a reception, the 
first 20 bytes of the data field are stored in the RFD 
(6 bytes of destination address, 6 bytes of source 
address, 2 bytes of length field, and 6 bytes of data) 
and the remainder of the data field is stored in the 
Receive Data Buffers. This is useful for capturing 
frame headers when header information is 
contained in the data field. The header information can 
then be automatically stored in the RFD, partitioned 
from the Receive Data Buffer. 

The control field can also be used for the Transmit 
Command when the Flexible memory structure is 
used. The quantity of data field bytes to be 
transmitted from the Transmit Command Block is specified 
by the variable control field. 
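The 20-byte receive example above can be sketched as a simple split of the incoming frame. This illustrates the partitioning only; the function name and the sample frame contents are hypothetical, and the real RFD also carries status and link fields that are not modeled here.

```python
def split_flexible(frame, rfd_data_size):
    """Split a received frame into the bytes stored in the RFD and the
    bytes stored in the Receive Data Buffers, where `rfd_data_size` is
    the value programmed in the control field."""
    if rfd_data_size < 0:
        raise ValueError("control field size cannot be negative")
    return frame[:rfd_data_size], frame[rfd_data_size:]


# Hypothetical frame: 6-byte destination, 6-byte source, 2-byte length
# field, then 50 bytes of data.
destination = bytes.fromhex("ffffffffffff")
source = bytes.fromhex("00aa00123456")
length_field = (50).to_bytes(2, "big")
data = bytes(range(50))
frame = destination + source + length_field + data

in_rfd, in_buffers = split_flexible(frame, 20)

# The RFD holds both addresses, the length field and the first 6 data
# bytes; the remaining 44 data bytes go to the Receive Data Buffers.
print(len(in_rfd), len(in_buffers))
```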
Figure 8. Simplified Memory Structure 

TRANSMITTING FRAMES 

The 82596 executes high-level Action Commands 
from the Command List in system memory. Action 
Commands are fetched and executed in parallel with 
the host CPU operation, thereby significantly improv- 
ing system performance. The format of the Action 
Commands is shown in Figure 10. Figure 28 shows 
the 82586 mode, and Figures 29 and 30 show the 
command formats of the Linear and 32-bit Segment- 
ed modes. 


A single Transmit command contains, as part of the 
command-specific parameters, the destination 
address and length field of the transmitted frame and a 
pointer to buffer area in memory containing the data 
portion of the frame. The data field is contained in a 
memory data structure consisting of a buffer 
descriptor (BD) and a data buffer, or a linked list of 
buffer descriptors and buffers, as shown in Figure 11. 

Multiple data buffers can be chained together using 
the BDs. Thus, a frame with a long data field can be 
transmitted using several (shorter) data buffers 
chained together. This chaining technique allows the 
system designer to develop efficient buffer 
management. 

The 82596 automatically generates the preamble 
(alternating 1s and 0s) and start frame delimiter, 
fetches the destination address and length field from 
the Transmit command, inserts its unique address 
as the source address, fetches the data field 
specified by the Transmit command, and computes and 
appends the CRC to the end of the frame (see 
Figure 12). In the Linear and 32-bit Segmented mode 
the CRC can be optionally inserted on a 
frame-by-frame basis by setting the NC bit in the Transmit 
Command Block (see Figures 29 and 30). 

The 82596 can be configured to generate two types 
of start and end frame delimiters: End of Carrier 
(EOC) or HDLC. In EOC mode the start frame 
delimiter is 10101011 and the end frame delimiter is 
indicated by the lack of a signal after the last bit of the 
frame check sequence field has been transmitted. In 
EOC mode the 82596 can be configured to extend 
short frames by adding pad bytes (7Eh) during 
transmission, according to the length field. In HDLC mode 
the 82596 will generate the 01111110 flag for the 
start and end frame delimiters, and do standard bit 
stuffing and stripping. Furthermore, the 82596 can 
be configured to pad frames shorter than the 
specified minimum frame length by appending the 
appropriate number of flags to the end of the frame. 

When a collision occurs, the 82596 manages the 
jam, random wait, and retry processes, reinitializing 
DMA pointers without CPU intervention. Multiple 
frames can be sent by linking the appropriate 
number of Transmit commands together. This is 
particularly useful when transmitting a message larger than 
the maximum frame size (1518 bytes for Ethernet). 
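The HDLC delimiting described in this section relies on bit stuffing so that the 01111110 flag can never appear inside a frame. The sketch below shows bit stuffing as it is generally defined for HDLC (a 0 is inserted after every five consecutive 1s); the text does not describe how the 82596 implements it internally, and the function names are hypothetical. The unstuffing helper assumes its input is a correctly stuffed bit sequence taken from between two flags.

```python
FLAG = [0, 1, 1, 1, 1, 1, 1, 0]   # the 01111110 delimiter named in the text


def stuff_bits(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, ones = [], 0
    for bit in bits:
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            out.append(0)      # stuffed bit breaks up the run
            ones = 0
    return out


def unstuff_bits(bits):
    """Remove the bit that follows every run of five consecutive 1s."""
    out, ones, skip_next = [], 0, False
    for bit in bits:
        if skip_next:
            skip_next = False  # drop the stuffed 0
            continue
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            skip_next = True
            ones = 0
    return out


def frame_hdlc(payload_bits):
    """Wrap stuffed payload bits in start and end flags."""
    return FLAG + stuff_bits(payload_bits) + FLAG


# A payload that happens to equal the flag pattern is altered on the wire
# and restored by the receiver.
payload = [0, 1, 1, 1, 1, 1, 1, 0]
on_wire = stuff_bits(payload)
print(on_wire, unstuff_bits(on_wire) == payload)
```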


[Figure 10 layout: control fields (command and status), link field (pointer to the next command), and parameter field (command-specific parameters).] 

Figure 10. Action Command Format 

Figure 11. Data Buffer Descriptor and Data Buffer Structure 

Figure 12. Frame Format 
RECEIVING FRAMES 

To reduce CPU overhead, the 82596 is designed to 
receive frames without CPU supervision. The host 
CPU first sets aside an adequate receive buffer 
space and then enables the 82596 Receive Unit. 
Once enabled, the RU watches for arriving frames 
and automatically stores them in the Receive Frame 
Area (RFA). The RFA contains Receive Frame De- 
scriptors, Receive Buffer Descriptors, and Data Buff- 
ers (see Figure 13). The individual Receive Frame 
Descriptors make up a Receive Descriptor List 
(RDL) used by the 82596 to store the destination 
and source addresses, the length field, and the 
status of each frame received (see Figure 14). 

Once enabled, the 82596 checks each passing 
frame for an address match. The 82596 will recog- 
nize its own unique address, one or more multicast 
addresses, or the broadcast address. If a match is 
found the 82596 stores the destination and source 
addresses and the length field in the next available 
RFD. It then begins filling the next available Data 
Buffer on the FBL, which is pointed to by the current 
RFD, with the data portion of the incoming frame. As 
one Data Buffer is filled, the 82596 automatically 
fetches the next DB on the FBL until the entire frame 
is received. This buffer chaining technique is particu- 
larly memory efficient because it allows the system 
designer to set aside buffers to fit frames much 
shorter than the maximum allowable frame length. If 
AL-LOC = 1, or if the flexible memory structure is 
used, the addresses and length field can be placed 
in the Receive Buffer. 

Once the entire frame is received without error, the 
82596 does the following housekeeping tasks. 

• The actual count field of the last Buffer Descrip- 
tor used to hold the frame just received is updat- 
ed with the number of bytes stored in the associ- 
ated Data Buffer. 

• The next available Receive Frame Descriptor is 
fetched. 

• The address of the next available Buffer Descrip- 
tor is written to the next available Receive Frame 
Descriptor. 

• A frame received interrupt status bit is posted in 
the SCB. 

• An interrupt is sent to the CPU. 

If a frame error occurs, for example a CRC error, the 
82596 automatically reinitializes its DMA pointers 
and reclaims any data buffers containing the bad 
frame. The 82596 will continue to receive frames 
without CPU help as long as Receive Frame 
Descriptors and Data Buffers are available. 
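The buffer-filling behavior described in this section can be modeled in a few lines. This is a simplified sketch, assuming each Data Buffer is represented only by its size; the function name is hypothetical, and descriptors, status bits and interrupts are not modeled.

```python
def store_frame(data_len, free_buffer_sizes):
    """Model how the data portion of one frame fills chained buffers.

    Returns (actual_counts, remaining_free_sizes), where actual_counts[i]
    is the number of bytes placed in the i-th buffer used. The last entry
    corresponds to the actual count written to the last Buffer Descriptor.
    Raises RuntimeError when the free list is exhausted.
    """
    counts = []
    remaining = data_len
    index = 0
    while remaining > 0:
        if index >= len(free_buffer_sizes):
            raise RuntimeError("out of receive buffers")
        used = min(remaining, free_buffer_sizes[index])
        counts.append(used)
        remaining -= used
        index += 1
    # Buffers that were not touched stay on the free list.
    return counts, free_buffer_sizes[index:]


# A 300-byte data field stored in 128-byte buffers uses three of them and
# leaves the fourth free for the next frame.
counts, still_free = store_frame(300, [128, 128, 128, 128])
print(counts, still_free)
```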


82596 NETWORK MANAGEMENT 
AND DIAGNOSTICS 

The behavior of data communication networks is 
normally very complex because of their distributed 
and asynchronous nature. It is particularly difficult to 
pinpoint a failure when it occurs. The 82596 has 
extensive diagnostic and network management 
functions that help improve reliability and testability. The 
82596 reports on the following events after each 
frame is transmitted. 

• Transmission successful. 

• Transmission unsuccessful. Lost Carrier Sense. 

• Transmission unsuccessful. Lost Clear to Send. 

• Transmission unsuccessful. A DMA underrun oc- 
curred because the system bus did not keep up 
with the transmission. 

• Transmission unsuccessful. The number of colli- 
sions exceeded the maximum allowed. 

• Number of Collisions. The number of collisions 
experienced during the frame. 

• Heartbeat Indicator. This indicates the presence 
of a heartbeat during the last Interframe Spacing 
(IFS) after transmission. 

When configured to Save Bad Frames the 82596 
checks each incoming frame and reports the follow- 
ing errors. 

• CRC error. Incorrect CRC in a properly aligned 
frame. 

• Alignment error. Incorrect CRC in a misaligned 
frame. 

• Frame too short. The frame is shorter than the 
value configured for minimum frame length. 

• Overrun. Part of the frame was not placed in 
memory because the system bus did not keep up 
with incoming data. 

• Out of buffer. Part of the frame was discarded 
because of insufficient memory storage space. 

• Receive collision. A collision was detected during 
reception. 

• Length error. A frame not matching the frame 
length parameter was detected. 
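The distinction between the first three receive errors in the list above can be sketched as follows. The sketch assumes the usual meaning of a misaligned frame, namely one whose bit count is not a whole number of bytes; the function name and its parameters are hypothetical, and the remaining errors (overrun, out of buffer, receive collision, length error) depend on device state that is not modeled here.

```python
def classify_frame(bit_count, crc_ok, min_frame_bytes):
    """Return the list of errors reported for a received frame.

    bit_count       -- number of bits received for the frame
    crc_ok          -- True when the received CRC is correct
    min_frame_bytes -- the configured minimum frame length in bytes
    """
    errors = []
    aligned = bit_count % 8 == 0    # assumption: aligned means whole bytes
    if not crc_ok:
        errors.append("CRC error" if aligned else "alignment error")
    if bit_count // 8 < min_frame_bytes:
        errors.append("frame too short")
    return errors


print(classify_frame(64 * 8, False, 64))      # bad CRC, whole bytes
print(classify_frame(64 * 8 + 3, False, 64))  # bad CRC, 3 extra bits
```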
Figure 13. Receive Frame Area Diagram 

NETWORK PLANNING AND 
MAINTENANCE 

To properly plan, operate, and maintain a communi- 
cation network, the network management entity 
must accumulate information on network behavior. 
The 82596 provides a rich set of network-wide diag- 
nostics that can serve as the basis for a network 
management entity. 

Information on network activity is provided in the 
status of each frame transmitted. The 82596 reports 
the following activity indicators after each frame. 

• Number of collisions. The number of collisions 
the 82596 experienced while attempting to trans- 
mit the frame. 

• Deferred transmission. During the first transmis- 
sion attempt the 82596 had to defer to traffic on 
the link. 

The 82596 updates its 32-bit statistical counters 
after each received frame that both passes address 
filtering and is longer than the Minimum Frame 
Length configuration parameter. The 82596 reports 
the following statistics. 

• CRC errors. The number of well-aligned frames 
that experienced a CRC error. 

• Alignment errors. The number of misaligned 
frames that experienced a CRC error. 

• No resources. The number of frames that were 
discarded because of insufficient resources for 
reception. 

• Overrun errors. The number of frames that were 
not completely stored in memory because the 
system bus did not keep up with incoming data. 

• Receive Collision counter. The number of 
collisions detected during receive. 

• Short Frame counter. The number of frames that 
were discarded because they were shorter than 
the configured minimum frame length. 

The 82596 can be configured to Promiscuous mode. 
In this mode it captures all frames transmitted on the 
network without checking the Destination Address. 
This is useful when implementing a monitoring sta- 
tion to capture all frames for analysis. 

A useful method of capturing frame headers is to 
use the Simplified memory mode, configure the 
82596 to Save Bad Frames, and configure the 
82596 to Promiscuous mode with space in the RFD 
allocated for a specific number of receive data bytes. 
The 82596 will receive all frames and put them in the 
RFD. Frames that exceed the available space in the 
RFD will be truncated, the status will be updated, 
and the 82596 will retrieve the next RFD. This allows 
the user to capture the initial data bytes of each 
frame (for instance, the header) and discard the re- 
mainder of the frame. 

The 82596 also has a monitor mode for network 
analysis. During normal operation the receive func- 
tion enables the 82596 to receive frames that pass 
address filtering. These frames must have the Start 
of Frame Delimiter (SFD) field and must be longer 
than the absolute minimum frame length of 5 bytes 
(6 bytes in case of Multicast address filtering). Con- 
tents and status of the received frames are trans- 
ferred to memory. The monitor function enables the 
82596 to simply evaluate the incoming frames. The 
82596 can monitor the frames that pass or do not 
pass the address filtering. It can also monitor frames 
which do not have the SFD fields. The 82596 can be 
configured to only keep statistical information about 
monitor frames. Three options are available in the 
Monitor mode. These options are selected by the 
two monitor mode configuration bits available in the 
configuration command. 

When the first option is selected, the 82596 receives 
good frames that pass address filtering and trans- 
fers them to memory while monitoring frames that 
do not pass address filtering or are shorter than the 
minimum frame size (these frames are not trans- 
ferred to memory). When this option is used the 
82596 updates six counters: CRC errors, alignment 
errors, no resource errors, overrun errors, short 
frames and total good frames received. 

When the second option is selected, the receive 
function is completely disabled. The 82596 monitors 
only those frames that pass address filtering and 
meet the minimum frame length requirement. When 
this option is used the 82596 updates six counters: 
CRC errors, alignment errors, total frames (good and 
bad), short frames, collisions detected and total 
good frames. 

When the third option is selected, the receive func- 
tion is completely disabled. The 82596 monitors all 
frames, including frames that do not have a Start 
Frame Delimiter. When this option is used the 82596 
updates six counters: CRC errors, alignment errors, 
total frames (good and bad), short frames, collisions 
detected and total good frames. 
STATION DIAGNOSTICS 
AND SELF-TEST 

The 82596 provides a large set of diagnostic and 
network management functions. These include 
internal and external loopback and time domain 
reflectometry for locating fault points in the network cable. 
The 82596 ensures software reliability by dumping 
the contents of the 82596 internal registers into 
system memory. The 82596 has a self-test mode that 
enables it to run an internal self-test and place the 
results in system memory. 


82586 SOFTWARE COMPATIBILITY 

The 82596 has a software-compatible state in which 
all its memory structures are compatible with the 
82586 memory structure. This includes all the Action 
Commands, the Receive Frame Area (including the 
RFD, Buffer Descriptors, and Data Buffers), the Sys- 
tem Control Block, and the Initialization procedures. 
There are two minor differences between the 82596 
in the 82586-Compatible memory structure and the 
82586. 

• When the internal and external loopback bits in 
the Configure command are set to 11 the 82596 
is in external loopback and the LPBK pin is 
activated; in the 82586 this situation would produce 
internal loopback. 

• During a Dump command both the 82596 and 
82586 dump the same number of bytes; however, 
the data format is different. 


INITIALIZING THE 82596 

A Reset command is issued to the 82596 to prepare 
it for normal operation. The 82596 is initialized 
through two data structures that are addressed by 
two pointers, the System Configuration Pointer 
(SCP) and the Intermediate System Configuration 
Pointer (ISCP). The initialization procedure begins 
when a Channel Attention signal is asserted after 
RESET. By default, the 82596 uses 00FFFFF4h as the 
address of the double word that contains the SCP. 
Before the CA signal is asserted this 
default address can be changed to any other 
available address by asserting the PORT pin and 
providing the desired address over the D31-D4 pins of the 
data bus. Pins D3-D0 must be 0010; i.e., any 
alternative address must be aligned to 16-byte 
boundaries. All addresses sent to the 82596 must be 
word aligned, which means that all pointers and 
memory structures must start on an even address 
(A0 = zero). 
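The alternative SCP write described above combines a 16-byte aligned address on D31-D4 with the code 0010 on D3-D0. A minimal sketch of forming that 32-bit value follows; the function and constant names are hypothetical, and the sketch does not model how the value is presented to the device across the PORT accesses.

```python
PORT_FUNC_ALT_SCP = 0b0010   # D3-D0 value given in the text for an SCP write


def alt_scp_port_word(scp_address):
    """Return the 32-bit value that carries an alternative SCP address.

    The address occupies D31-D4, so it must be 16-byte aligned; the low
    four bits carry the function code.
    """
    if not 0 <= scp_address <= 0xFFFFFFFF:
        raise ValueError("SCP address must fit in 32 bits")
    if scp_address & 0xF:
        raise ValueError("alternative SCP address must be 16-byte aligned")
    return scp_address | PORT_FUNC_ALT_SCP


print(hex(alt_scp_port_word(0x00100000)))
```

Rejecting unaligned addresses up front mirrors the alignment rule in the text: the low four address bits have nowhere to go once D3-D0 hold the function code.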


SYSTEM CONFIGURATION POINTER 
(SCP) 

The SCP contains the sysbus byte and the location 
of the next structure of the initialization process, the 
ISCP. The following parameters are selected in the 
SYSBUS. 



• The 82596 operation mode. 

• The Bus Throttle timer triggering method. 

• Lock enabled. 


• Interrupt polarity. 

• Big Endian 32-bit entity mode. 


Byte ordering is determined by the LE/BE pin. 
LE/BE = 1 selects Little Endian byte ordering and 
LE/BE = 0 selects Big Endian byte ordering. 


NOTE: 

In the following, X indicates a bit not checked in 
82586 mode. This bit must be set to 0 in all other 
modes. 
The following diagram illustrates the format of the SCP. 

             ODD WORD (bits 31-16)               EVEN WORD (bits 15-0) 
Address      Bits 31-24        Bits 23-16        Bits 15-0 

00FFFFF4h    X                 SYSBUS            0 (all 16 bits) 
00FFFFF8h    X (all 32 bits) 
00FFFFFCh    ISCP ADDRESS (A31-A0) 

A31-A24 are not checked in 82586 mode. 
X areas are not checked in 82586 mode; they must be 0 in all other modes. 

SYSBUS (bits 23-16 of the double word at 00FFFFF4h) contains the following fields. 

• Big Endian 32-bit entity mode. 
  0: The 32-bit address pointers in Linear mode are treated as two 16-bit big endian entities. This is identical to 
     the 82596 A1 stepping definition. 
  1: The 32-bit address pointers in Linear mode are treated as 32-bit big endian entities. This mode is only 
     supported in the 82596 B stepping. In this mode the SCB absolute address and statistical counters are still 
     treated as two 16-bit big endian entities. 

• INT. Interrupt polarity. 
  0: Interrupt pin is active high. 
  1: Interrupt pin is active low. 

• LOCK. 
  0: Lock function enabled. 
  1: Lock function disabled. 

• TRG. 
  0: Internal triggering of the Bus Throttle timers. 
  1: External triggering of the Bus Throttle timers. 

• M (two bits). Operation mode. 
  0 0: 82586 mode 
  0 1: 32-Bit Segmented mode 
  1 0: Linear mode 
  1 1: Reserved 

• One bit of the byte is marked NOT CHECKED. 

ISCP ADDRESS: The physical address of the ISCP. In the 82586 mode, bits A31-A24 are considered to 
be zero. 


Figure 15. The System Configuration Pointer 


Writing the Sysbus 

When writing the sysbus byte it is important to pay attention to the byte order. 

• When a Little Endian processor is used, the sysbus byte is located at byte address 00FFFFF6h (or address 
n + 2 if an alternative SCP address n was programmed). 

• When a processor using Big Endian byte ordering is used, the sysbus, alternative SCP, and ISCP addresses 
will be different. 

• The sysbus byte is located at 00FFFFF5h. 

• If an alternative SCP address is programmed, the sysbus byte should be at byte address n + 1. 
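The byte-order rule above reduces to a fixed offset from the SCP address: the sysbus byte sits at offset 2 for Little Endian ordering and at offset 1 for Big Endian ordering. A minimal sketch, with hypothetical names, assuming `scp_address` is the address of the double word that contains the SCP (00FFFFF4h by default):

```python
DEFAULT_SCP_ADDRESS = 0x00FFFFF4   # default SCP double word address


def sysbus_byte_address(scp_address=DEFAULT_SCP_ADDRESS, big_endian=False):
    """Return the byte address at which the CPU writes the sysbus byte."""
    offset = 1 if big_endian else 2
    return scp_address + offset


print(hex(sysbus_byte_address()))                  # Little Endian, default SCP
print(hex(sysbus_byte_address(big_endian=True)))   # Big Endian, default SCP
```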
INTERMEDIATE SYSTEM CONFIGURATION POINTER (ISCP) 

The ISCP indicates the location of the System Control Block. Often the SCP is in ROM and the ISCP is in RAM. 
The CPU loads the SCB address (or an equivalent data structure) into the ISCP and asserts CA. This Channel 
Attention signal causes the 82596 to begin its initialization procedure and to get the SCB address from the 
ISCP and SCP. In 82586 and 32-bit Segmented modes the SCB base address is also the base address of all 
Command Blocks, Frame Descriptors, and Buffer Descriptors (but not buffers). All these data structures must 
reside in one 64-KB segment; however, in Linear mode no such limitation is imposed. 


The following diagram illustrates the ISCP format. 


31 


ODD WORD 


EVEN WORD 

16 15 87 


0 


A15 

SCB OFFSET 

AO 


BUSY 


A23 


SCB BASE ADDRESS 


AO 

X X 

t 

X X X X X X 

— in 82586 mode 






A31 A24 — in 32-bit segmented mode. 


BUSY — Indicates that the 82596 is being initialized. The CPU sets the ISCP to 01h before it gives
the first CA to the 82596. The ISCP is cleared by the 82596 after the SCB base and offset
are read. Note that the most significant byte of the first word of the ISCP is not modified
when BUSY is cleared.

SCB OFFSET — This 16-bit quantity specifies the offset portion of the address of the SCB. 

SCB BASE — Specifies the base portion of the address of the SCB. The base of SCB is also the base of
all 82596 Command Blocks, Frame Descriptors and Buffer Descriptors. In the 82586
mode, bits A31-A24 are considered to be zero.


Figure 16. The Intermediate System Configuration Pointer — 82586 and 32-Bit Segmented Modes

BUSY — Indicates that the 82596 is being initialized. The ISCP is set to 01h by the CPU before its
first CA to the 82596. It is cleared by the 82596 after the SCB address is read.

SCB ADDRESS — This 32-bit quantity specifies the physical address of the SCB.

Figure 17. The Intermediate System Configuration Pointer — Linear Mode. 


INITIALIZATION PROCESS 

The CPU sets up the SCP, ISCP, and the SCB structures, and, if desired, an alternative SCP address. It also
sets BUSY to 01h. The 82596 is initialized when a Channel Attention signal follows a Reset signal, causing the
82596 to access the System Configuration Pointer. The sysbus byte, the operational mode, the bus throttle 
timer triggering method, the interrupt polarity, and the state of LOCK are read. After reset the Bus Throttle 
timers are essentially disabled — the T-ON value is infinite, the T-OFF value is zero. After the SCP is read, the 
82596 reads the ISCP and saves the SCB address. In 82586 and 32-bit Segmented modes this address is 
represented as a base address plus the offset (this base address is also the base address of all the control 
blocks). In Linear mode the base address is also an absolute address. The 82596 clears BUSY, sets CX and 
CNR to equal 1 in the SCB, clears the SCB command word, sends an interrupt to the CPU, and awaits another 
Channel Attention signal. RESET configures the 82596 to its default state before CA is asserted. 
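
The BUSY handshake in this process can be sketched from the CPU side as follows. This is a minimal illustration, not driver code: the iscp_view_t type and both function names are hypothetical, and how Channel Attention is actually asserted is board specific and left out. Only the low byte of the first ISCP dword is tested, because the text notes that the upper byte of that word is not modified when BUSY is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the first ISCP dword as seen by the CPU. */
typedef struct {
    volatile uint32_t busy_dword;
} iscp_view_t;

/* CPU step: set BUSY to 01h before the first Channel Attention,
 * leaving the other bits of the dword unchanged. */
void iscp_mark_busy(iscp_view_t *iscp)
{
    iscp->busy_dword = (iscp->busy_dword & ~0xFFu) | 0x01u;
}

/* CPU step after asserting CA: poll until the 82596 clears BUSY.
 * Returns true if BUSY was seen cleared within max_polls reads. */
bool iscp_wait_not_busy(const iscp_view_t *iscp, unsigned max_polls)
{
    assert(iscp != 0);
    for (unsigned i = 0; i < max_polls; i++) {
        if ((iscp->busy_dword & 0xFFu) == 0)
            return true;
    }
    return false;
}
```

A bounded poll count is used here so that a chip that never responds does not hang the caller; a real driver would more likely wait on the interrupt that the 82596 sends at the end of initialization.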
CONTROLLING THE 82596CA

The host CPU controls the 82596 with the commands, data structures, and methods described in this section. 
The CPU and the 82596 communicate through shared memory structures. The 82596 contains two indepen- 
dent units: the Command Unit and the Receive Unit. The Command Unit executes commands from the CPU, 
and the Receive Unit handles frame reception. These two units are controlled and monitored by the CPU
through a shared memory structure called the System Control Block (SCB). The CPU and the 82596 use the
CA and INT signals to communicate with the SCB.


82596 CPU ACCESS INTERFACE (PORT) 

The 82596 has a CPU access interface that allows the host CPU to do four things. 

• Write an alternative System Configuration Pointer address. 

• Write an alternative Dump area pointer and perform Dump. 

• Execute a software reset. 

• Execute a self-test. 

The following events initiate the CPU access state. 

• Presence of an address on the D31-D4 data bus pins.

• The D3-D0 pins are used to select one of the four functions.

• The PORT input pin is asserted, as in a regular write cycle.

NOTE:

The SCP, Dump, and Self-Test addresses must be 16-byte aligned.

The 82596 requires two 16-bit write cycles for a port command. The first write holds the internal machines and
reads the first 16 bits; the second activates the PORT command and reads the second 16 bits.

The PORT Reset is useful when only the 82596 needs to be reset. The CPU must wait for 10 system clocks and
5 serial clocks before issuing another CA to the 82596; this new CA begins a new initialization process.

The Dump function is useful for troubleshooting No Response problems. If the chip is in a No Response state,
the PORT Dump operation can be executed and a PORT Reset can be used to reinitialize the 82596 without
disturbing the rest of the system.

The Self-Test function can be used for board testing; the 82596 will execute a self-test and write the results to
memory.


Table 2. PORT Function Selection 



Function     D31 ................................... D4     D3  D2  D1  D0
             Addresses and Results

Reset        A31   Don’t Care                      A4      0   0   0   0
Self-Test    A31   Self-Test Results Address       A4      0   0   0   1
SCP          A31   Alternative SCP Address         A4      0   0   1   0
Dump         A31   Dump Area Pointer               A4      0   0   1   1
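
The PORT encoding in Table 2 can be sketched in C as follows. The function and enum names are hypothetical. The helper that splits the value into two 16-bit halves only computes them; which half is written first is not stated in this section, so the write order is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Function codes carried on D3-D0, per Table 2. */
enum port_function {
    PORT_RESET     = 0x0,
    PORT_SELF_TEST = 0x1,
    PORT_SCP       = 0x2,
    PORT_DUMP      = 0x3
};

/* Combines a 16-byte-aligned address (D31-D4) with a function code. */
uint32_t port_command(uint32_t addr, enum port_function fn)
{
    assert((addr & 0xFu) == 0);   /* addresses must be 16-byte aligned */
    return addr | (uint32_t)fn;
}

/* Splits a PORT value into the two 16-bit quantities needed for the
 * two write cycles. */
void port_split(uint32_t cmd, uint16_t *low, uint16_t *high)
{
    *low  = (uint16_t)(cmd & 0xFFFFu);
    *high = (uint16_t)(cmd >> 16);
}
```

For a PORT Reset the address field is "Don’t Care", so port_command(0, PORT_RESET) is sufficient.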


MEMORY ADDRESSING FORMATS 

The 82596 accesses memory by 32-bit addresses. There are two types of 32-bit addresses: linear and seg-
mented. The type of address used depends on the 82596 operating mode and the type of memory structure it
is addressing. The 82596 has three operating modes.



• 82586 Mode

  • A Linear address is a single 24-bit entity. Address pins A31-A24 are always zero.

  • A Segmented address uses a 24-bit base and a 16-bit offset.

• 32-bit Segmented Mode

  • A Linear address is a single 32-bit entity.

  • A Segmented address uses a 32-bit base and a 16-bit offset.

NOTE:

In the previous two memory addressing modes, each command header (CB, TBD, RFD, RBD, and SCB)
must wholly reside within one segment. If the 82596 encounters a memory structure that does not follow this
restriction, the 82596 will fetch the next contiguous location in memory (beyond the segment).

• Linear Mode

  • A Linear address is a single 32-bit entity.

  • There are no Segmented addresses.

Linear addresses are primarily used to address transmit and receive data buffers. In the 82586 and 32-bit 
Segmented modes, segmented addresses (base plus offset) are used for all Command Blocks, Buffer Descrip- 
tors, Frame Descriptors, and System Control Blocks. When using Segmented addresses, only the offset 
portion of the entity being addressed is specified in the block. The base for all offsets is the same — that of the 
SCB. See Table 1. 
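
As an illustration of base-plus-offset addressing, the sketch below resolves a segmented address to a physical one. It assumes that the physical address is the plain sum of the SCB base and the 16-bit offset; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Resolves a segmented address (base + 16-bit offset).
 * In 82586 mode the base is 24 bits wide and A31-A24 are zero. */
uint32_t segmented_to_physical(uint32_t base, uint16_t offset, bool mode_82586)
{
    /* A 24-bit base must not use the upper eight address bits. */
    assert(!mode_82586 || (base >> 24) == 0);
    uint32_t addr = base + offset;
    if (mode_82586)
        addr &= 0x00FFFFFFu;   /* keep A31-A24 at zero */
    return addr;
}
```

Because every block stores only its offset, a driver would typically keep the SCB base in one place and pass it to a helper like this whenever it needs a CPU-visible pointer.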


LITTLE ENDIAN AND BIG ENDIAN BYTE ORDERING 

The 82596 supports both Little Endian and Big Endian byte ordering for its memory structures. 

The 82596 A1 stepping supports Big Endian byte ordering for word and byte entities. Dword entities are not
supported with 82596 A1 Big Endian byte ordering. This results in slightly different 82596 A1 memory struc-
tures for Big Endian operation. These structures are defined in the 32-Bit LAN Components User’s Manual.

The 82596 B stepping supports Big Endian byte ordering for Linear mode only. All 82596 B 32-bit address
pointers are treated as 32-bit Big Endian entities; however, the SCB absolute address and statistical counters
are treated as two 16-bit Big Endian entities. This 32-bit Big Endian entity support is configured through bit 7 in
the SYSBUS byte.


NOTE:

All 82596 memory entities must be word or dword aligned, except that transmit buffers can be byte aligned
for the 82596 B stepping.

An example of a dword entity is a frame descriptor command/status dword, whereas the raw data of the frame
are byte entities. Both 32- and 16-bit buses are supported. When a 16-bit bus is used with Big Endian memory
organization, data lines D15-D0 are used. The 82596 has an internal crossover that handles these swap
operations.


COMMAND UNIT (CU) 

The Command Unit is the logical unit that executes Action Commands from a list of commands very similar to 
a CPU program. A Command Block is associated with each Action Command. The CU is modeled as a logical 
machine that takes, at any given time, one of the following states. 

• Idle. The CU is not executing a command and is not associated with a CB on the list. This is the initial state. 

• Suspended. The CU is not executing a command; however, it is associated with a CB on the list.

• Active. The CU is executing an Action Command and pointing to its CB.

The CPU can affect CU operation in two ways: by issuing a CU Control Command or by setting bits in the
Command word of the Action Command.


RECEIVE UNIT (RU) 

The Receive Unit is the logical unit that receives frames and stores them in memory. The RU is modeled as a
logical machine that takes, at any given time, one of the following states.

• Idle. The RU has no memory resources and is discarding incoming frames. This is the initial state.

• No Resources. The RU has no memory resources and is discarding incoming frames. This state differs
from Idle in that the RU accumulates statistics on the number of discarded frames.

• Suspended. The RU has memory available for storing frames, but is discarding them. The suspend state
can only be reached if the CPU forces this through the SCB or sets the suspend bit in the RFD.

• Ready. The RU has memory available and is storing incoming frames. 

The CPU can affect RU operation in three ways: by issuing an RU Control Command, by setting bits in the 
Frame Descriptor Command word of the frame being received, or by setting the EL bit of the current buffer’s 
Buffer Descriptor. 


SYSTEM CONTROL BLOCK (SCB) 

The SCB is a memory block that plays a major role in communications between the CPU and the 82596. Such 
communications include the following. 

• Commands issued by the CPU 

• Status reported by the 82596 

Control commands are sent to the 82596 by writing them into the SCB and then asserting CA. The 82596 
examines the command, performs the required action, and then clears the SCB command word. Control 
commands perform the following types of tasks. 

• Operation of the Command Unit (CU). The SCB controls the CU by specifying the address of the Command 
Block List (CBL) and by starting, suspending, resuming, or aborting execution of CBL commands. 

• Operation of the Bus Throttle. The SCB controls the Bus Throttle timers by providing them with new values
and sending the Load and Start timer commands. The timers can be operated in both the 32-bit Segmented
and Linear modes.

• Reception of frames by the Receive Unit (RU). The SCB controls the RU by specifying the address of the 
Receive Frame Area and by starting, suspending, resuming, or aborting frame reception. 

• Acknowledgment of events that cause interrupts. 

• Resetting the chip. 

The 82596 sends status reports to the CPU via the System Control Block. The SCB contains four types of 
status reports. 

• The cause of the current interrupts. These interrupts are caused by one or more of the following 82596 
events. 

• The Command Unit completes an Action Command that has its I bit set. 

• The Receive Unit receives a frame. 

• The Command Unit becomes inactive. 

• The Receive Unit becomes not ready. 

• The status of the Command Unit. 

• The status of the Receive Unit. 

• Status reports from the 82596 regarding reception of corrupted frames. 
Events can be cleared only by CPU acknowledgment. If some events are not acknowledged by the ACK field,
the Interrupt signal (INT) will be reissued after Channel Attention (CA) is processed. Furthermore, if a new
event occurs while an interrupt is set, the interrupt is temporarily cleared to trigger edge-triggered interrupt
controllers.

The CPU uses the Channel Attention line to cause the 82596 to examine the SCB. This signal is trailing-edge 
triggered — the 82596 latches CA on the trailing edge. The latch is cleared by the 82596 before the SCB 
control command is read. 


 31            ODD WORD            16 15           EVEN WORD            0

 ACK   X   CUC   R   RUC   X X X X     STAT   0   CUS   0   RUS   0 0 0 0
           RFA OFFSET                           CBL OFFSET                   SCB + 4
           ALIGNMENT ERRORS                     CRC ERRORS                   SCB + 8
           OVERRUN ERRORS                       RESOURCE ERRORS              SCB + 12


Figure 18. SCB — 82586 Mode 
*In Monitor mode these counters change function.


Figure 19. SCB — 32-Bit Segmented Mode

*In Monitor mode these counters change function.


Figure 20. SCB — Linear Mode 
These bits specify the action to be performed as a result of a CA. This word is set by the CPU and cleared by
the 82596. Defined bits are:

Bit 31 ACK-CX — Acknowledges that the CU completed an Action Command.

Bit 30 ACK-FR — Acknowledges that the RU received a frame.

Bit 29 ACK-CNA — Acknowledges that the Command Unit became not active.

Bit 28 ACK-RNR — Acknowledges that the Receive Unit became not ready.

Bits 24-26 CUC — (3 bits) This field contains the command to the Command Unit. Valid values are:

0 — NOP (does not affect current state of the unit).

1 — Start execution of the first command on the CBL. If a command is executing,
complete it before starting the new CBL. The beginning of the CBL is in CBL
OFFSET (address).

2 — Resume the operation of the Command Unit by executing the next command.
This operation assumes that the Command Unit has been previously suspended.

3 — Suspend execution of commands on CBL after current command is complete.

4 — Abort current command immediately.

5 — Loads the Bus Throttle timers so they will be initialized with their new values
after the active timer (T-ON or T-OFF) reaches Terminal Count. If no timer is
active, new values will be loaded immediately. This command is not valid in
82586 mode.

6 — Loads and immediately restarts the Bus Throttle timers with their new values.
This command is not valid in 82586 mode.

7 — Reserved.

Bit 23 RESET — Reset chip (logically the same as hardware RESET).

Bits 20-22 RUC — (3 bits) This field contains the command to the Receive Unit. Valid values are:

0 — NOP (does not alter current state of unit).

1 — Start reception of frames. The beginning of the RFA is contained in the RFA
OFFSET (address). If a frame is being received, complete reception before
starting.

2 — Resume frame reception (only when in suspended state).

3 — Suspend frame reception. If a frame is being received, complete its reception
before suspending.

4 — Abort receiver operation immediately.

5-7 — Reserved.
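
The bit assignments above can be captured in a small C helper that composes the command word. The bit positions are those listed in the text, relative to the first SCB dword; the macro, enum, and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Acknowledge bits (bits 31-28 of the first SCB dword). */
#define SCB_ACK_CX   (1u << 31)
#define SCB_ACK_FR   (1u << 30)
#define SCB_ACK_CNA  (1u << 29)
#define SCB_ACK_RNR  (1u << 28)
#define SCB_ACK_MASK (SCB_ACK_CX | SCB_ACK_FR | SCB_ACK_CNA | SCB_ACK_RNR)
#define SCB_RESET    (1u << 23)

/* CUC field, bits 24-26. Values 5 and 6 are not valid in 82586 mode. */
enum scb_cuc {
    CUC_NOP = 0, CUC_START = 1, CUC_RESUME = 2, CUC_SUSPEND = 3,
    CUC_ABORT = 4, CUC_LOAD_TIMERS = 5, CUC_LOAD_START_TIMERS = 6
};

/* RUC field, bits 20-22. */
enum scb_ruc {
    RUC_NOP = 0, RUC_START = 1, RUC_RESUME = 2, RUC_SUSPEND = 3,
    RUC_ABORT = 4
};

/* Builds the command half of the first SCB dword. */
uint32_t scb_command(uint32_t ack_bits, enum scb_cuc cuc, enum scb_ruc ruc)
{
    assert((ack_bits & ~SCB_ACK_MASK) == 0);
    return ack_bits | ((uint32_t)cuc << 24) | ((uint32_t)ruc << 20);
}
```

A typical use would be to acknowledge the events just read from the status word and start both units in one CA, for example scb_command(SCB_ACK_CX | SCB_ACK_FR, CUC_START, RUC_START).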
Status Word

Indicates the status of the 82596. This word is modified only by the 82596. Defined bits are:

Bit 15 CX — The CU finished executing a command with its I (interrupt) bit set.

Bit 14 FR — The RU finished receiving a frame.

Bit 13 CNA — The Command Unit left the Active state.

Bit 12 RNR — The Receive Unit left the Ready state.

Bits 8-10 CUS — (3 bits) This field contains the status of the command unit. Valid values are:

0 — Idle

1 — Suspended

2 — Active

3-7 — Not used

Bits 4-7 RUS — This field contains the status of the receive unit. Valid values are:

0h (0000) — Idle

1h (0001) — Suspended

2h (0010) — No Resources. This bit indicates both no resources due to lack of
RFDs in the RDL and no resources due to lack of RBDs in the FBL.

4h (0100) — Ready

Ah (1010) — No resources due to no more RBDs (not in the 82586 mode).

Ch (1100) — No more RBDs (not in 82586 mode).

No other combinations are allowed.

Bit 3 T — Bus Throttle timers loaded (not in 82586 mode).
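
A decoder for the status word, following the bit positions listed above, might look like this in C. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the 16-bit SCB status word. */
typedef struct {
    bool cx, fr, cna, rnr;  /* bits 15, 14, 13, 12 */
    unsigned cus;           /* bits 8-10 */
    unsigned rus;           /* bits 4-7  */
    bool t;                 /* bit 3 (not in 82586 mode) */
} scb_status_t;

scb_status_t scb_decode_status(uint16_t w)
{
    scb_status_t s;
    s.cx  = (w >> 15) & 1u;
    s.fr  = (w >> 14) & 1u;
    s.cna = (w >> 13) & 1u;
    s.rnr = (w >> 12) & 1u;
    s.cus = (w >> 8) & 0x7u;
    s.rus = (w >> 4) & 0xFu;
    s.t   = (w >> 3) & 1u;
    return s;
}

/* True for the RUS encodings the text lists as valid. */
bool rus_is_valid(unsigned rus)
{
    assert(rus <= 0xFu);
    return rus == 0x0u || rus == 0x1u || rus == 0x2u ||
           rus == 0x4u || rus == 0xAu || rus == 0xCu;
}
```

An interrupt handler would normally read the status word once, decode it, and then write the matching ACK bits back through the command word.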


SCB OFFSET ADDRESSES 


CBL Offset (Address) 

In 82586 and 32-bit Segmented modes this 16-bit quantity indicates the offset portion of the address for the 
first Command Block on the CBL. In Linear mode it is a 32-bit linear address for the first Command Block on 
the CBL. It is accessed only if CUC equals Start. 


RFA Offset (Address) 

In 82586 and 32-bit Segmented modes this 16-bit quantity indicates the offset portion of the address for the 
Receive Frame Area. In Linear mode it is a 32-bit linear address for the Receive Frame Area. It is accessed 
only if RUC equals Start. 
SCB STATISTICAL COUNTERS


Statistical Counter Operation 

• The CPU is responsible for clearing all error counters before initializing the 82596. The 82596 updates
these counters by reading them, adding 1, and then writing them back to the SCB.

• The counters are wraparound counters. After reaching FFFFFFFFh the counters wrap around to zero.

• The 82596 updates the required counters for each frame. It is possible for more than one counter to be
updated; multiple errors will result in all affected counters being updated.

• The 82596 executes the read-counter/increment/write-counter operation without relinquishing the bus
(locked operation). This is to ensure that no logical contention exists between the 82596 and the CPU due
to both attempting to write to the counters simultaneously. In the dual-port memory configuration the CPU
should not execute any write operation to a counter if LOCK is asserted.

• The counters are 32 bits wide and their behavior is fully compatible with the IEEE 802.3 standard. The
82596 supports all relevant statistics (mandatory, optional, and desired) through the status of the transmit 
and receive header and directly through SCB statistics. 
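
Because the counters wrap from FFFFFFFFh to zero, host software that samples them periodically can recover the number of events between two samples with unsigned subtraction. The sketch below assumes 32-bit counters and at most one wrap between samples; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of events between two samples of a 32-bit wraparound counter.
 * Unsigned arithmetic is modulo 2^32, so one wrap is handled correctly. */
uint32_t counter_delta(uint32_t previous, uint32_t current)
{
    return (uint32_t)(current - previous);
}

/* Adds the delta to a 64-bit running total kept by the host. */
uint64_t counter_accumulate(uint64_t total, uint32_t previous, uint32_t current)
{
    assert(total <= UINT64_MAX - UINT32_MAX);  /* guard against overflow */
    return total + counter_delta(previous, current);
}
```

Sampling this way avoids writing to the counters while the 82596 is running, which is the case the locked read-modify-write is meant to protect.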


CRCERRS 

This 32-bit quantity contains the number of aligned frames discarded because of a CRC error. This counter is 
updated, if needed, regardless of the RU state. 


ALNERRS 

This 32-bit quantity contains the number of frames that both are misaligned (i.e., where CRS deasserts on a 
nonoctet boundary) and contain a CRC error. The counter is updated, if needed, regardless of the RU state. 


SHRTFRM 

This 32-bit quantity contains the number of received frames shorter than the minimum frame length. 
The last three counters change function in monitor mode. 


RSCERRS 

This 32-bit quantity contains the number of good frames discarded because there were no resources to
contain them. Frames intended for a host whose RU is in the No Receive Resources state fall into this
category. This counter is updated only if the RU is in the No Resources state. When in Monitor mode this
counter counts the total number of frames, good and bad.

OVRNERRS

This 32-bit quantity contains the number of frames known to be lost because the local system bus was not
available. If the traffic problem lasts longer than the duration of one frame, the frames that follow the first are
lost without an indicator, and they are not counted. This counter is updated, if needed, regardless of the RU
state. 


RCVCDT 

This 32-bit quantity contains the number of collisions detected during frame reception. In Monitor mode this 
counter counts the total number of good frames. 


ACTION COMMANDS AND OPERATING MODES 


This section lists all the Action Commands of the Command Unit Command Block List (CBL). Each command 
contains the Command field, the Status and Control fields, the link to the next Action Command, and any 
command-specific parameters. There are three basic types of action commands: 82596 Configuration and 
Setup, Transmission, and Diagnostics. The following is a list of the actual commands. 

• NOP

• Individual Address Setup

• Configure

• MC Setup

• Transmit

• TDR

• Dump

• Diagnose


The 82596 has three addressing modes. In the 82586 mode all the Action Commands look exactly like those 
of the 82586. 

• 82586 Mode. The 82596 software and memory structure is compatible with the 82586.

• 32-Bit Segmented Mode. The 82596 can access the entire system memory and use the two new memory
structures (Simplified and Flexible) while still using the segmented approach. This does not require any
significant changes to existing software.

• Linear Mode. The 82596 operates in a flat, linear, 4-gigabyte memory space without segmentation. It can
also use the two new memory structures.


In the 32-bit Segmented mode there are some differences between the 82596 and 82586 action commands,
mainly in programming and activating new 82596 features. Those bits marked “don’t care” in the compatible
mode are not checked; however, we strongly recommend that those bits all be zeroes; this will allow future
enhancements and extensions.


In the Linear mode all of the address offsets become 32-bit address pointers. All new 82596 features are 
accessible in this mode, and all bits previously marked “don’t care’’ must be zeroes. 

The Action Commands, and all other 82596 memory structures, must begin on even byte boundaries, i.e., they 
must be word aligned. 
NOP

This command results in no action by the 82596 except for those performed in the normal command process-
ing. It is used to manipulate the CBL. The format of the NOP command is shown in Figure 21.


NOP — 82586 and 32-Bit Segmented Modes

NOP — Linear Mode

Figure 21


where:

LINK POINTER — In the 82586 or 32-bit Segmented modes this is a 16-bit offset to the next Command
Block. In the Linear mode this is the 32-bit address of the next Command Block.

EL — If set, this bit indicates that this command block is the last on the CBL.

S — If set to one, suspend the CU upon completion of this CB.

I — If set to one, the 82596 will generate an interrupt after execution of the command is
complete. If I is not set to one, the CX bit will not be set.

CMD (bits 16-18) — The NOP command. Value: 0h.

Bits 19-28 — Reserved (zero in the 32-bit Segmented and Linear modes).

C — This bit indicates the execution status of the command. The CPU initially resets it to zero
when the Command Block is placed on the CBL. Following a command completion, the
82596 will set it to one.

B — This bit indicates that the 82596 is currently executing the NOP command. It is initially
reset to zero by the CPU. The 82596 sets it to one when execution begins and to zero
when execution is completed. This bit is also set when the 82596 prefetches the com-
mand.

NOTE:

The C and B bits are modified in one operation.

OK — Indicates that the command was executed without error. If set to one no error occurred
(command executed OK). If zero an error occurred.
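
The common Command Block header described above can be sketched in C. The text gives CMD as bits 16-18 and the reserved field as bits 19-28, which leaves bits 29-31 for I, S, and EL. The positions used here for C, B, and OK (bits 15, 14, and 13) are an assumption that this section does not state, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CB_EL (1u << 31)   /* last command block on the CBL */
#define CB_S  (1u << 30)   /* suspend the CU after this CB */
#define CB_I  (1u << 29)   /* interrupt after completion */
#define CB_C  (1u << 15)   /* assumed position: command complete */
#define CB_B  (1u << 14)   /* assumed position: command busy */
#define CB_OK (1u << 13)   /* assumed position: no error */

/* Command codes that appear in this section. */
enum cb_cmd {
    CB_CMD_NOP = 0, CB_CMD_IA_SETUP = 1, CB_CMD_CONFIGURE = 2,
    CB_CMD_MC_SETUP = 3, CB_CMD_TRANSMIT = 4
};

/* Builds the first dword of a CB; C, B, and OK start at zero. */
uint32_t cb_command_dword(enum cb_cmd cmd, uint32_t flags)
{
    assert((flags & ~(CB_EL | CB_S | CB_I)) == 0);
    return flags | ((uint32_t)cmd << 16);
}

/* True once the 82596 reports the command finished without error. */
bool cb_completed_ok(uint32_t dword)
{
    return (dword & CB_C) != 0 && (dword & CB_B) == 0 && (dword & CB_OK) != 0;
}
```

Since C and B are modified in one operation, checking both in a single read of the dword, as cb_completed_ok does, avoids seeing a half-updated status.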


Individual Address Setup 


This command is used to load the 82596 with the Individual Address. This address is used by the 82596 for 
inserting the Source Address during transmission and recognizing the Destination Address during reception. 
After RESET, and prior to Individual Address Setup Command execution, the 82596 assumes the Broadcast
Address is the Individual Address in all aspects, i.e.:

• This will be the Individual Address Match reference. 


• This will be the Source Address of a transmitted frame (for AL-LOC = 0 mode only). 
The format of the Individual Address Setup command is shown in Figure 22.


IA Setup — 82586 and 32-Bit Segmented Modes

Figure 22


where:

LINK ADDRESS, EL, B, C, I, S — As per standard Command Block (see the NOP command for details).

A — Indicates that the command was abnormally terminated due to a CU Abort control
command. If one, then the command was aborted, and if necessary it should be
repeated. If this bit is zero, the command was not aborted.

Bits 19-28 — Reserved (zero in the 32-bit Segmented and Linear modes).

CMD (bits 16-18) — The Address Setup command. Value: 1h.

INDIVIDUAL ADDRESS — The Individual Address of the node, 0 to 6 bytes long.

The least significant bit of the Individual Address must be zero for Ethernet (see the Command Structure).
However, no enforcement of 0 is provided by the 82596. Thus, an Individual Address with 1 as its least
significant bit is a valid Individual Address in all aspects.

The default address length is 6 bytes long, as in 802.3. If a different length is used the IA Setup command
should be executed after the Configure command.


Configure 

The Configure command loads the 82596 with its operating parameters. It allows changing some of the
parameters by specifying a byte count less than the maximum number of configuration bytes (11 in the 82586
mode, 14 in the 32-Bit Segmented and Linear modes). The 82596 configuration depends on its mode of
operation. When configuring the 12th byte (Byte 11 undefined) in 82586 mode this byte should be all ones.

• In the 82586 mode the maximum number of configuration bytes is 12. Any number larger than 12 will be
reduced to 12 and any number less than 4 will be increased to 4.

• The additional features of the serial side are disabled in the 82586 mode. 

• In both the 32-Bit Segmented and Linear modes there are four additional configuration bytes, which hold 
parameters for additional 82596 features. If these parameters are not accessed, the 82596 will follow their 
default values. 

• For more detailed information refer to the 32-Bit LAN Components User’s Manual. 
Figure 25. CONFIGURE — Linear Mode


LINK ADDRESS, EL, B, C, I, S — As per standard Command Block (see the NOP command for details).

A — Indicates that the command was abnormally terminated due to a CU Abort control com-
mand. If 1, then the command was aborted and if necessary it should be repeated. If this
bit is 0, the command was not aborted.

Bits 19-28 — Reserved (zero in the 32-Bit Segmented and Linear Modes).

CMD (bits 16-18) — The CONFIGURE command. Value: 2h.


The interpretation of the fields follows: 

BYTE 0

BYTE CNT (Bits 0-3)     Byte Count. Number of bytes, including this one, that hold
                        parameters to be configured.

PREFETCHED (Bit 7)      Enables the 82596 to write the prefetched bit in all prefetched
                        RBDs.

NOTE:

The P bit is valid only in the new memory structure modes. In 82586 mode this bit is disabled (i.e., no
prefetched mark).


BYTE 1

FIFO LIMIT (Bits 0-3)   FIFO limit.

MONITOR# (Bits 6-7)     Receive monitor options. If the Byte Count of the configure
                        command is less than 12 bytes then these Monitor bits are ignored.

DEFAULT: C8h


BYTE 2

Bit layout (7 to 0): SAV BF, 1, 0, 0, 0, 0, RESUME_RD, 0

SAV BF (Bit 7)          0 — Received bad frames are not saved in the memory.
                        1 — Received bad frames are saved in the memory.

RESUME_RD (Bit 1)       0 — The 82596 does not reread the next CB on the list when a CU Resume
                            Control Command is issued.
                        1 — The 82596 will reread the next CB on the list when a CU Resume
                            Control Command is issued. This is available only on the 82596 B
                            stepping.

DEFAULT: 40h


BYTE 3

Bit layout (7 to 0): LOOPBACK MODE, PREAMBLE LENGTH, NO SRC ADD INS, ADDRESS LENGTH

ADR LEN (Bits 0-2)          Address length (any kind).

NO SRC ADD INS (Bit 3)      No Source Address Insertion.
                            In the 82586 this bit is called AL LOC.

PREAM LEN (Bits 4-5)        Preamble length.

LP BCK MODE (Bits 6-7)      Loopback mode.

DEFAULT: 26h


BYTE 4

Bit layout (7 to 0): BOF METD, EXPONENTIAL PRIORITY, 0, LINEAR PRIORITY

LIN PRIO (Bits 0-2)     Linear Priority.

EXP PRIO (Bits 4-6)     Exponential Priority.

BOF METD (Bit 7)        Exponential Backoff method.

DEFAULT: 00h


BYTE 5

INTERFRAME SPACING      Interframe spacing.

DEFAULT: 60h
BYTE 6

SLOT TIME (L)           Slot time, low byte.

DEFAULT: 00h


BYTE 7

Bit layout (7 to 0): MAXIMUM RETRY NUMBER, 0, SLOT TIME - HIGH

SLOT TIME (H) (Bits 0-2)    Slot time, high part.

RETRY NUM (Bits 4-7)        Number of transmission retries on collision.

DEFAULT: F2h

BYTE 8

PRM (Bit 0)                 Promiscuous mode.

BC DIS (Bit 1)              Broadcast disable.

MANCH/NRZ (Bit 2)           Manchester or NRZ encoding. See specific timing require-
                            ments for TXC in Manchester mode.

TONO CRS (Bit 3)            Transmit on no CRS.

NOCRC INS (Bit 4)           No CRC insertion.

CRC-16/CRC-32 (Bit 5)       CRC type.

BIT STF (Bit 6)             Bit stuffing.

PAD (Bit 7)                 Padding.

DEFAULT: 00h


BYTE 9

CRSF (Bits 0-2)         Carrier Sense filter (length).

CRS SRC (Bit 3)         Carrier Sense source.

CDTF (Bits 4-6)         Collision Detect filter (length).

CDT SRC (Bit 7)         Collision Detect source.

DEFAULT: 00h
BYTE 10

MIN FRAME LEN           Minimum frame length.

DEFAULT: 40h


BYTE 11

PRECRS (Bit 0)          Preamble until Carrier Sense.

LNGFLD (Bit 1)          Length field. Enables padding at the End-of-Carrier framing (802.3).

CRCINM (Bit 2)          Rx CRC appended to the frame in memory.

AUTOTX (Bit 3)          Auto retransmit when a collision occurs during the preamble.

CDBSAC (Bit 4)          Collision Detect by source address recognition.

MC ALL (Bit 5)          Enable to receive all MC frames.

MONITOR (Bits 6-7)      Receive monitor options.

DEFAULT: FFh


BYTE 12

FDX (Bit 6)             Enables Full Duplex operation.

DEFAULT: 00h


BYTE 13

MULT IA (Bit 6)         Multiple individual address.

DIS BOF (Bit 7)         Disable the backoff algorithm.

DEFAULT: 3Fh
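
The per-byte DEFAULT values above can be collected into a C array and cross-checked against the field definitions. This is a sketch only: the array and accessor names are hypothetical, and the value used for byte 0 (a byte count of 14 with the prefetch bit clear) is an assumption, since the text gives no DEFAULT for that byte.

```c
#include <assert.h>
#include <stdint.h>

/* Configuration bytes 0-13, using the DEFAULT value listed for each byte. */
const uint8_t cfg_defaults[14] = {
    0x0E,             /* byte 0: assumed, byte count = 14, prefetch bit = 0 */
    0xC8, 0x40, 0x26, /* bytes 1-3 */
    0x00, 0x60, 0x00, /* bytes 4-6 */
    0xF2, 0x00, 0x00, /* bytes 7-9 */
    0x40, 0xFF, 0x00, /* bytes 10-12 */
    0x3F              /* byte 13 */
};

/* Field accessors following the bit ranges given for each byte. */
unsigned cfg_fifo_limit(const uint8_t *c)         { return c[1] & 0x0Fu; }
unsigned cfg_address_length(const uint8_t *c)     { return c[3] & 0x07u; }
unsigned cfg_interframe_spacing(const uint8_t *c) { return c[5]; }
unsigned cfg_retry_number(const uint8_t *c)       { return c[7] >> 4; }
unsigned cfg_min_frame_len(const uint8_t *c)      { return c[10]; }

/* Slot time: low byte in byte 6, high part in bits 0-2 of byte 7. */
unsigned cfg_slot_time(const uint8_t *c)
{
    assert(c != 0);
    return (unsigned)c[6] | ((unsigned)(c[7] & 0x07u) << 8);
}
```

Decoding the defaults this way gives an address length of 6 bytes, an interframe spacing of 96, a slot time of 512, 15 retries, and a minimum frame length of 64, which is a useful sanity check on a hand-built Configure block.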
A reset (hardware or software) configures the 82596 according to the following defaults.


Table 4. Configuration Defaults 



  Parameter                     Default Value   Units/Meaning

  ADDRESS LENGTH                **6             Bytes
  A/L FIELD LOCATION            0               Located in FD
* AUTO RETRANSMIT               1               Auto Retransmit Enable
  BITSTUFFING/EOC               0               EOC
  BROADCAST DISABLE             0               Broadcast Reception Enabled
* CDBSAC                        1               Disabled
  CDT FILTER                    0               Bit Times
  CDT SRC                       0               External Collision Detection
* CRC IN MEMORY                 1               CRC Not Transferred to Memory
  CRC-16/CRC-32                 **0             CRC-32
  CRS FILTER                    0               0 Bit Times
  CRS SRC                       0               External CRS
* DISBOF                        0               Backoff Enabled
  EXT LOOPBACK                  0               Disabled
  EXPONENTIAL PRIORITY                          802.3 Algorithm
  EXPONENTIAL BACKOFF METHOD                    802.3 Algorithm
* FULL DUPLEX (FDX)             0               CSMA/CD Protocol (No FDX)
  FIFO THRESHOLD                8               TX: 32 Bytes, RX: 64 Bytes
  INT LOOPBACK                  0               Disabled
  INTERFRAME SPACING            **96            Bit Times
  LINEAR PRIORITY                               802.3 Algorithm
* LENGTH FIELD                  1               Padding Disabled
  MIN FRAME LENGTH              **64            Bytes
* MCALL                         1               Disabled
* MONITOR                       11              Disabled
  MANCHESTER/NRZ                0               NRZ
* MULTI IA                      0               Disabled
  NUMBER OF RETRIES             **15            Maximum Number of Retries
  NO CRC INSERTION              0               CRC Appended to Frame
  PREFETCH BIT IN RBD           0               Disabled (Valid Only in New Modes)
  PREAMBLE LENGTH                               Bytes
* PREAMBLE UNTIL CRS            1               Disabled
  PROMISCUOUS MODE              0               Address Filter On
  PADDING                       0               No Padding
  SLOT TIME                     **512           Bit Times
  SAVE BAD FRAME                0               Discards Bad Frames
  TRANSMIT ON NO CRS            0               Disabled


NOTES:

1. This configuration setup is compatible with the IEEE 802.3 specification.

2. The asterisk signifies a new configuration parameter not available in the 82586.

3. The default value of the Auto retransmit configuration parameter is enabled (1).

4. Double asterisk signifies IEEE 802.3 requirements.
Multicast-Setup

This command is used to load the 82596 with the Multicast-IDs that should be accepted. As noted previously, 
the filtering done on the Multicast-IDs is not perfect and some unwanted frames may be accepted. This 
command resets the current filter and reloads it with the specified Multicast-IDs. The format of the Multicast- 
addresses setup command is: 



Figure 26. MC Setup— 82586 and 32-Bit Segmented Modes 



Figure 27. MC Setup— Linear Mode 


where: 

LINK ADDRESS, EL, B, C, I, S: As per standard Command Block (see the NOP command for details).

A: Indicates that the command was abnormally terminated due to a CU Abort control command. If 1, the
command was aborted and, if necessary, it should be repeated. If this bit is 0, the command was not aborted.

Bits 19-28: Reserved (0 in both the 32-Bit Segmented and Linear Modes).

CMD (bits 16-18): The MC SETUP command value: 3h.

MC-CNT: This 14-bit field indicates the number of bytes in the MC LIST field. The MC CNT must be a
multiple of the ADDR LEN; otherwise, the 82596 reduces the MC CNT to the nearest ADDR LEN multiple.
MC CNT = 0 implies resetting the Hash table, which is equivalent to disabling the Multicast filtering
mechanism.

MC LIST: A list of Multicast Addresses to be accepted by the 82596. The least significant bit of each MC
address must be 1.

NOTE:

The list is sequential; i.e., the most significant byte of an address is immediately followed by the least
significant byte of the next address.

When the 82596 is configured to recognize multiple Individual Addresses (Multi-IA), the MC-Setup command
is also used to set up the Hash table for the individual addresses. The least significant bit in the first byte
of each IA address must be 0.
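
The MC-CNT and MC LIST rules above can be checked in driver code before the command is queued. The following is a minimal sketch in C, not taken from the data sheet: the helper names are hypothetical, "reduces to the nearest ADDR LEN multiple" is read as rounding down, and the MC LIST is assumed to be a packed byte array with each address stored first byte first.

```c
#include <stdbool.h>
#include <stdint.h>

/* MC CNT is a 14-bit byte count. */
#define MC_CNT_MASK 0x3FFFu

/* Round a requested MC CNT down to a whole number of addresses.
 * Assumption: "reduces to the nearest ADDR LEN multiple" means rounding down. */
static uint16_t mc_cnt_effective(uint16_t mc_cnt, uint8_t addr_len)
{
    mc_cnt &= MC_CNT_MASK;
    if (addr_len == 0)
        return 0;
    return (uint16_t)(mc_cnt - (mc_cnt % addr_len));
}

/* Return true if every address in the packed list has bit 0 of its first
 * byte set, i.e. every entry is a multicast address. */
static bool mc_list_is_valid(const uint8_t *list, uint16_t mc_cnt, uint8_t addr_len)
{
    uint16_t n = mc_cnt_effective(mc_cnt, addr_len);
    for (uint16_t off = 0; off < n; off = (uint16_t)(off + addr_len)) {
        if ((list[off] & 1u) == 0)
            return false;
    }
    return true;
}
```

With 6-byte addresses, a requested MC CNT of 14 is treated as 12 (two whole addresses), and an MC CNT of 0 leaves the list empty, which the text above describes as disabling Multicast filtering.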
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Transmit 

This command is used to transmit a frame of user data onto the serial link. The format of a Transmit command 
is as follows. 


Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S I X X X X X X X X X X 1 0 0        C B STATUS BITS MAXCOLL
4        A15 TBD OFFSET A0                       A15 LINK OFFSET A0
8        4th byte ... DESTINATION ADDRESS ... 1st byte
12       LENGTH FIELD                            6th byte (DESTINATION ADDRESS)


Figure 28. TRANSMIT— 82586 Mode 


Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S I 0 0 0 0 0 0 0 0 NC SF 1 0 0      C B STATUS BITS MAXCOLL
4        A15 TBD OFFSET A0                       A15 LINK OFFSET A0
8        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0         EOF 0 TCB COUNT
12       4th byte ... DESTINATION ADDRESS ... 1st byte
16       LENGTH FIELD                            6th byte (DESTINATION ADDRESS)
20       OPTIONAL DATA


Figure 29. TRANSMIT— 32-Bit Segmented Mode 


Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S I 0 0 0 0 0 0 0 0 NC SF 1 0 0      C B STATUS BITS MAXCOLL
4        A31 LINK ADDRESS A0
8        A31 TRANSMIT BUFFER DESCRIPTOR ADDRESS A0
12       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0         EOF 0 TCB COUNT
16       4th byte ... DESTINATION ADDRESS ... 1st byte
20       LENGTH FIELD                            6th byte (DESTINATION ADDRESS)
24       OPTIONAL DATA


Figure 30. TRANSMIT— Linear Mode 


31                  COMMAND WORD                  16

EL  S  I  0  0  0  0  0  0  0  0  NC  SF  1  0  0

NC   0: No CRC Insertion disable; when the configure command is configured to not insert the CRC
        during transmission, the NC bit has no effect.

     1: No CRC Insertion enable; when the configure command is configured to insert the CRC during
        transmission, the CRC will not be inserted when NC = 1.

SF   0: Simplified Mode. All the Tx data is in the Transmit Command Block. The Transmit Buffer
        Descriptor Address field is all 1s.

     1: Flexible Mode. Data is in the TCB and in a linked list of TBDs.
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where: 

EL, B, C, I, S: As per standard Command Block (see the NOP command for details).

OK (Bit 13): Error free completion.

A (Bit 12): Indicates that the command was abnormally terminated due to a CU Abort control command. If 1,
the command was aborted, and if necessary it should be repeated. If this bit is 0, the command was not
aborted.

Bits 19-28: Reserved (0 in the 32-bit Segmented and Linear modes).

CMD (Bits 16-18): The transmit command: 4h.

Status Bit 11: Late collision. A late collision (a collision after the slot time has elapsed) was detected.

Status Bit 10: No Carrier Sense signal during transmission. Carrier Sense is monitored from the end of
Preamble transmission until the end of the Frame Check Sequence. For TONOCRS = 1 (Transmit On No
Carrier Sense mode) this bit indicates that transmission has been executed despite a lack of CRS. For
TONOCRS = 0 (Ethernet mode) it also indicates unsuccessful transmission (transmission stopped when
lack of Carrier Sense was detected).

Status Bit 9: Transmission unsuccessful (stopped) due to Loss of CTS.

Status Bit 8: Transmission unsuccessful (stopped) due to DMA Underrun; i.e., the system did not supply
data for transmission.

Status Bit 7: Transmission Deferred; i.e., transmission was not immediate due to previous link activity.

Status Bit 6: Heartbeat Indicator. Indicates that after a previously performed transmission, and before the
most recently performed transmission (Interframe Spacing), the CDT signal was monitored as active. This
indicates that the Ethernet Transceiver Collision Detect logic is performing properly. The Heartbeat is
monitored during the Interframe Spacing period.

Status Bit 5: Transmission attempt was stopped because the number of collisions exceeded the maximum
allowable number of retries.

Status Bit 4: 0 (Reserved).

MAX-COL (Bits 3-0): The number of collisions experienced during this frame. MAX-COL = 0 plus Status
Bit 5 = 1 indicates 16 collisions.

LINK OFFSET: As per standard Command Block (see the NOP command for details).

TBD POINTER: In the 82586 and 32-bit Segmented modes this is the offset of the first Tx Buffer Descriptor
containing the data to be transmitted. In the Linear mode this is the 32-bit address of the first Tx Buffer
Descriptor on the list. If the TBD POINTER is all 1s, it indicates that no TBD is used.

DEST ADDRESS: Contains the Destination Address of the frame. The least significant bit (MC) indicates the
address type.

    MC = 0: Individual Address.
    MC = 1: Multicast or Broadcast Address.

If the Destination Address bits are all 1s, this is a Broadcast Address.

LENGTH FIELD: The contents of this 2-byte field are user defined. In 802.3 it contains the length of the
data field. It is placed in memory in the same order it is transmitted; i.e., most significant byte first, least
significant byte second.

TCB COUNT: This 14-bit counter indicates the number of bytes that will be transmitted from the Transmit
Command Block, starting from the third byte after the TCB COUNT field (address N+12 in the 32-bit
Segmented mode, N+16 in the Linear mode). The TCB COUNT field can be any number of bytes (including
an odd number), which allows the user to transmit a frame with a header having an odd number of bytes.
The TCB COUNT field is not used in the 82586 mode.

EOF Bit: Indicates that the whole frame is kept in the Transmit Command Block. In the Simplified memory
model it must always be asserted.
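
The transmit status bits described above map directly onto bit masks. The following is a minimal sketch in C of how a driver might decode the 16-bit status word of a completed Transmit command. The macro and function names are hypothetical; the bit numbers for OK, A, Status Bits 11 to 5 and MAX-COL come from the list above, and the function assumes the caller has already extracted the status half of the first command-block word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the field descriptions above. */
#define TX_STAT_OK            (1u << 13)  /* error free completion          */
#define TX_STAT_ABORTED       (1u << 12)  /* terminated by CU Abort         */
#define TX_STAT_LATE_COLL     (1u << 11)  /* collision after the slot time  */
#define TX_STAT_NO_CRS        (1u << 10)  /* no Carrier Sense during Tx     */
#define TX_STAT_CTS_LOST      (1u << 9)   /* stopped: loss of CTS           */
#define TX_STAT_DMA_UNDERRUN  (1u << 8)   /* stopped: DMA underrun          */
#define TX_STAT_DEFERRED      (1u << 7)   /* deferred to earlier traffic    */
#define TX_STAT_HEARTBEAT     (1u << 6)   /* CDT seen during interframe gap */
#define TX_STAT_MAX_RETRIES   (1u << 5)   /* retry limit exceeded           */
#define TX_STAT_MAXCOL_MASK   0x000Fu     /* collision count, bits 3-0      */

/* Number of collisions seen for this frame. A count of 0 together with
 * Status Bit 5 set means 16 collisions, as stated in the text. */
static unsigned tx_collision_count(uint16_t status)
{
    unsigned n = status & TX_STAT_MAXCOL_MASK;
    if (n == 0 && (status & TX_STAT_MAX_RETRIES) != 0)
        return 16;
    return n;
}

/* True when the frame went out without error. */
static bool tx_completed_ok(uint16_t status)
{
    return (status & TX_STAT_OK) != 0;
}
```

A status of 0x2003, for example, decodes as a successful transmission that needed three collisions' worth of retries.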
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The interpretation of what is transmitted depends on the No Source Address insertion configuration bit and the 
memory model being used. 

NOTES: 

1. The Destination Address and the Length Field are sequential. The Length Field immediately follows the
most significant byte of the Destination Address.

2. If the 82596 is configured with the No Source Address insertion bit equal to 0, the 82596 inserts its
configured Source Address in the transmitted frame.

• In the 82586 mode, or when the Simplified memory model is used, the Destination and Length fields of the
transmitted frame are taken from the Transmit Command Block.

• If the Flexible memory model is used, the Destination and Length fields of the transmitted frame can be
found either in the TCB or TBD, depending on the TCB COUNT.

3. If the 82596 is configured with the Address/Length Field Location equal to 1, the 82596 does not insert its
configured Source Address in the transmitted frame. The first (2 x Address Length) + 2 bytes of the
transmitted frame are interpreted as Destination Address, Source Address, and Length fields respectively.
The location of the first transmitted byte depends on the operational mode of the 82596:

• In the 82586 mode, it is always the first byte of the first Tx Buffer.

• In both the 32-bit Segmented and Linear modes it depends on the SF bit and TCB COUNT:

- In the Simplified memory mode the first transmitted byte is always the third byte after the TCB COUNT
field.

- In the Flexible mode, if the TCB COUNT is greater than 0 then it is the third byte after the TCB COUNT
field. If TCB COUNT equals 0 then it is the first byte of the first Tx Buffer.

• Transmit frames shorter than six bytes are invalid. The transmission will be aborted (only in 82586 mode)
because of a DMA Underrun.

4. Frames which are aborted during transmission are jammed. Such an interruption of transmission can be
caused by any reason indicated by any of the status bits 8, 9, 10, and 12.
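
The rules in note 3 for locating the first transmitted byte reduce to a small decision function. The sketch below, in C, is illustrative only: the enum and function names are hypothetical, and it covers just the case where the Address/Length Field Location parameter equals 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Operating modes named in the text (identifiers are hypothetical). */
enum i596_mode { I596_MODE_82586, I596_MODE_SEGMENTED32, I596_MODE_LINEAR };

/* Where the first transmitted byte is fetched from. */
enum tx_first_byte { TX_FIRST_IN_TCB, TX_FIRST_IN_TX_BUFFER };

/* Apply note 3: 82586 mode always starts in the first Tx Buffer; the new
 * modes start in the TCB unless Flexible mode is used with TCB COUNT = 0. */
static enum tx_first_byte tx_first_byte_source(enum i596_mode mode,
                                               bool flexible,
                                               uint16_t tcb_count)
{
    if (mode == I596_MODE_82586)
        return TX_FIRST_IN_TX_BUFFER;
    if (!flexible)
        return TX_FIRST_IN_TCB;             /* Simplified memory mode */
    return (tcb_count & 0x3FFFu) > 0 ? TX_FIRST_IN_TCB
                                     : TX_FIRST_IN_TX_BUFFER;
}

/* Bytes interpreted as Destination Address, Source Address and Length:
 * (2 x Address Length) + 2. */
static unsigned tx_header_bytes(unsigned addr_len)
{
    return 2u * addr_len + 2u;
}
```

For 6-byte addresses, tx_header_bytes(6) gives 14, the familiar 802.3 header size.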


Jamming Rules 

1. Jamming will not start before completion of preamble transmission. 

2. Collisions detected during transmission of the last 1 1 bits will not result In jamming. 

The format of a Transmit Buffer Descriptor is:

82586 Mode

Offset   Bits 31-16 (ODD WORD)          Bits 15-0 (EVEN WORD)

0        NEXT TBD OFFSET                EOF X SIZE (ACT COUNT, bits 13-0)
4        X X X X X X X X (bits 31-24)   TRANSMIT BUFFER ADDRESS (bits 23-0)

32-Bit Segmented Mode

Offset   Bits 31-16 (ODD WORD)          Bits 15-0 (EVEN WORD)

0        NEXT TBD OFFSET                EOF 0 SIZE (ACT COUNT, bits 13-0)
4        TRANSMIT BUFFER ADDRESS

Linear Mode

Offset   Bits 31-16 (ODD WORD)          Bits 15-0 (EVEN WORD)

0        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0   EOF 0 SIZE (ACT COUNT, bits 13-0)
4        NEXT TBD ADDRESS
8        TRANSMIT BUFFER ADDRESS


Figure 31 
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where: 


EOF: This bit indicates that this TBD is the last one associated with the frame being transmitted. It is
set by the CPU before transmit.

SIZE (ACT COUNT): This 14-bit quantity specifies the number of bytes that hold information for the
current buffer. It is set by the CPU before transmission.

NEXT TBD ADDRESS: In the 82586 and 32-bit Segmented modes, it is the offset of the next TBD on the
list. In the Linear mode this is the 32-bit address of the next TBD on the list. It is meaningless if EOF = 1.

BUFFER ADDRESS: The starting address of the memory area that contains the data to be sent. In the
82586 mode, this is a 24-bit address (A31-A24 are considered to be zero). In the 32-bit Segmented and
Linear modes this is a 32-bit address. This buffer can be byte aligned for the 82596 B step.
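
As an illustration of the field descriptions above, the following C sketch fills in a Linear-mode Transmit Buffer Descriptor. It is a sketch under stated assumptions rather than a reference layout: the struct and function names are hypothetical, the three 32-bit words are ordered as control word, NEXT TBD ADDRESS, BUFFER ADDRESS, and EOF is taken to be bit 15 of the control word with SIZE in bits 13-0.

```c
#include <stdbool.h>
#include <stdint.h>

#define TBD_EOF        (1u << 15)   /* last TBD of the frame (assumed bit 15) */
#define TBD_SIZE_MASK  0x3FFFu      /* 14-bit SIZE (ACT COUNT) field          */

/* Hypothetical in-memory image of a Linear-mode TBD. */
struct tbd_linear {
    uint32_t ctrl;        /* bits 31-16 zero, EOF, 0, SIZE   */
    uint32_t next_tbd;    /* 32-bit address of the next TBD  */
    uint32_t buf_addr;    /* 32-bit address of the data      */
};

/* Build one descriptor. The CPU sets EOF and SIZE before transmission;
 * next_tbd is ignored by the 82596 when EOF = 1. */
static struct tbd_linear tbd_linear_make(uint32_t buf_addr, uint16_t size,
                                         uint32_t next_tbd, bool last)
{
    struct tbd_linear tbd;
    tbd.ctrl = (uint32_t)(size & TBD_SIZE_MASK);
    if (last)
        tbd.ctrl |= TBD_EOF;
    tbd.next_tbd = next_tbd;
    tbd.buf_addr = buf_addr;
    return tbd;
}
```

A two-buffer frame would use one descriptor with last = false pointing at a second descriptor built with last = true.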


TDR 

This operation activates Time Domain Reflectometry, which is a mechanism to detect open or short circuits on
the link and their distance from the diagnosing station. The TDR command has no parameters. The TDR
transmit sequence was changed, compared to the 82586, to form a regular transmission. The TDR bit stream
is as follows.

- Preamble

- Source address

- Another Source address (the TDR frame is transmitted back to the sending station,
so DEST ADR = SRC ADR).

- Data field containing 7Eh patterns.

- Jam Pattern, which is the inverse CRC of the transmitted frame.

Maximum length of the TDR frame is 2048 bits. If the 82596 senses collision while transmitting the TDR frame
it transmits the jam pattern and stops the transmission. The 82596 then triggers an internal timer (STC); the
timer is reset at the beginning of transmission and reset if CRS is returned. The timer measures the time
elapsed from the start of transmission until an echo is returned. The echo is indicated by Collision Detect going
active or a drop in the Carrier Sense signal. The following table lists the possible cases that the 82596 is able
to analyze.


Conditions of TDR as Interpreted by the 82596

                                                       Transceiver Type
Condition                                       Ethernet                   Non Ethernet

Carrier Sense was inactive for                  Short or Open on the       NA
2048-bit-time periods                           Transceiver Cable

Carrier Sense signal dropped                    Short on the Ethernet      NA
                                                cable

Collision Detect went active                    Open on the Ethernet       Open on the Serial Link
                                                cable

The Carrier Sense Signal did not drop or        No Problem                 No Problem
the Collision Detect did not go active
within 2048-bit time period


An Ethernet transceiver is defined as one that returns transmitted data on the receive pair and activates the
Carrier Sense Signal while transmitting. A Non-Ethernet Transceiver is defined as one that does not do so.
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The format of the Time Domain Reflectometer command is: 


82586 and 32-Bit Segmented Modes

Offset   Bits 31-16 (ODD WORD)                         Bits 15-0 (EVEN WORD)

0        EL S I X X X X X X X X X X 1 0 1               C B OK 0 0 0 0 0 0 0 0 0 0 0 0 0
4        LNK OK, XVR PRB, ET OPN, ET SRT, X,            A15 LINK OFFSET A0
         TIME (11 bits)

Linear Mode

Offset   Bits 31-16 (ODD WORD)                         Bits 15-0 (EVEN WORD)

0        EL S I 0 0 0 0 0 0 0 0 0 0 1 0 1               C B OK 0 0 0 0 0 0 0 0 0 0 0 0 0
4        A31 LINK ADDRESS A0
8        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0                LNK OK, XVR PRB, ET OPN, ET SRT, X,
                                                        TIME (11 bits)


Figure 32. TDR 


where: 

LINK ADDRESS, EL, B, C, I, S: As per standard Command Block (see the NOP command for details).

A: Indicates that the command was abnormally terminated due to a CU Abort control command. If 1, the
command was aborted, and if necessary it should be repeated. If this bit is 0, the command was not
aborted.

Bits 19-28: Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (Bits 16-18): The TDR command. Value: 5h.

TIME: An 11-bit field that specifies the number of TxC cycles that elapsed before an echo was observed.
No echo is indicated by a reception consisting of 1s only. Because the network contains various elements
such as transceiver links, transceivers, Ethernet, repeaters, etc., the TIME is not exactly proportional to the
problem's distance.

LNK OK (Bit 15): No link problem identified. TIME = 7FFh.

XCVR PRB (Bit 14): Indicates a Transceiver problem. Carrier Sense was inactive for a 2048-bit time
period. LNK OK = 0. TIME = 7FFh.

ET OPN (Bit 13): The transmission line is not properly terminated. Collision Detect went active and
LNK OK = 0.

ET SRT (Bit 12): There is a short circuit on the transmission line. Carrier Sense Signal dropped and
LNK OK = 0.
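
The TDR result bits listed above can be decoded with a few masks. The following C sketch assumes the caller has already read the 16-bit TDR result field out of the command block; the type and function names are hypothetical, while the bit positions (LNK OK 15, XCVR PRB 14, ET OPN 13, ET SRT 12, TIME in the low 11 bits) follow the descriptions above.

```c
#include <stdint.h>

#define TDR_LNK_OK     (1u << 15)
#define TDR_XCVR_PRB   (1u << 14)
#define TDR_ET_OPN     (1u << 13)
#define TDR_ET_SRT     (1u << 12)
#define TDR_TIME_MASK  0x07FFu      /* 11-bit TIME field */

enum tdr_verdict {
    TDR_VERDICT_LINK_OK,        /* no link problem identified      */
    TDR_VERDICT_XCVR_PROBLEM,   /* transceiver or transceiver cable */
    TDR_VERDICT_OPEN,           /* line not properly terminated     */
    TDR_VERDICT_SHORT,          /* short circuit on the line        */
    TDR_VERDICT_UNKNOWN         /* none of the documented bits set  */
};

struct tdr_report {
    enum tdr_verdict verdict;
    uint16_t time;              /* TxC cycles until the echo; 7FFh = no echo */
};

/* Decode one TDR result word into a verdict plus the raw TIME value. */
static struct tdr_report tdr_decode(uint16_t result)
{
    struct tdr_report r;
    r.time = (uint16_t)(result & TDR_TIME_MASK);
    if (result & TDR_LNK_OK)
        r.verdict = TDR_VERDICT_LINK_OK;
    else if (result & TDR_XCVR_PRB)
        r.verdict = TDR_VERDICT_XCVR_PROBLEM;
    else if (result & TDR_ET_OPN)
        r.verdict = TDR_VERDICT_OPEN;
    else if (result & TDR_ET_SRT)
        r.verdict = TDR_VERDICT_SHORT;
    else
        r.verdict = TDR_VERDICT_UNKNOWN;
    return r;
}
```

Because TIME is not exactly proportional to distance, the sketch returns the raw count and leaves any conversion to the caller.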
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DUMP 

This command causes the contents of various 82596 registers to be placed in a memory area specified by the
user. It is supplied as an 82596 self-diagnostic tool, and to provide registers of interest to the user. The format
of the DUMP command is:


82586 and 32-Bit Segmented Modes

Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S I X X X X X X X X X X 1 1 0        C B OK 0 0 0 0 0 0 0 0 0 0 0 0 0
4        A15 BUFFER OFFSET A0                    A15 LINK OFFSET A0

Linear Mode

Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S I X X X X X X X X X X 1 1 0        C B OK 0 0 0 0 0 0 0 0 0 0 0 0 0
4        A31 LINK ADDRESS A0
8        A31 BUFFER ADDRESS A0


Figure 33. Dump 


where: 

LINK ADDRESS, EL, B, C, I, S: As per standard Command Block (see the NOP command for details).

OK: Indicates error free completion.

Bits 19-28: Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (Bits 16-18): The Dump command. Value: 6h.

BUFFER POINTER: In the 82586 and 32-bit Segmented modes this is the 16-bit-offset portion of the
dump area address. In the Linear mode this is the 32-bit linear address of the dump area.


Dump Area Information Format 

• The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586 
mode the 82596 will dump the same number of bytes as the 82586. The compatible data will be marked 
with an asterisk. 

• In 82586 mode the dump area is 170 bytes. 

• The DUMP area format of the 32-bit Segmented and Linear modes is described in Figure 35. 

• The size of the dump area of the 32-bit Segmented and Linear modes is 304 bytes. 

• When the Dump is executed by the Port command an extra word will be appended to the Dump Area. The 
extra word is a copy of the Dump Area status word (containing the C, B, and OK Bits). The C and OK Bits 
are set when the 82596 has completed the Port Dump command. 
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15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 



Offset (hex)   Contents

00             DMA CONTROL REGISTER
02             CONFIGURE BYTES* 3, 2
04             CONFIGURE BYTES* 5, 4
06             CONFIGURE BYTES* 7, 6
08             CONFIGURE BYTES* 9, 8
0A             CONFIGURE BYTES* 10
0C             I.A. BYTES 1, 0*
0E             I.A. BYTES 3, 2*
10             I.A. BYTES 5, 4*
12             LAST T.X. STATUS*
14             T.X. CRC BYTES 1, 0*
16             T.X. CRC BYTES 3, 2*
18             R.X. CRC BYTES 1, 0*
1A             R.X. CRC BYTES 3, 2*
1C             R.X. TEMP MEMORY 1, 0*
1E             R.X. TEMP MEMORY 3, 2*
20             R.X. TEMP MEMORY 5, 4*
22             LAST RECEIVED STATUS*
24             HASH REGISTER BYTES 1, 0*
26             HASH REGISTER BYTES 3, 2*
28             HASH REGISTER BYTES 5, 4*
2A             HASH REGISTER BYTES 7, 6*
2C             SLOT TIME COUNTER*
2E             WAIT TIME COUNTER*
30 to 6A       MICRO MACHINE** REGISTER FILE (60 BYTES)
6C             MICRO MACHINE LFSR**
6E to 7A       MICRO MACHINE** FLAG ARRAY (14 BYTES)
7C to 82       QUEUE MEMORY** CU PORT (8 BYTES)
84             MICRO MACHINE ALU**
86             RESERVED**
88             M.M. TEMP A ROTATE R**
8A             M.M. TEMP A**
8C             T.X. DMA BYTE COUNT**
8E             M.M. INPUT PORT ADDRESS**
90             T.X. DMA ADDRESS
92             M.M. OUTPUT PORT**
94             R.X. DMA BYTE COUNT**
96             M.M. OUTPUT PORT ADDRESS REGISTER**
98             R.X. DMA ADDRESS**
9A             RESERVED**
9C             BUS THROTTLE TIMERS
9E             DIU CONTROL REGISTER**
A0             RESERVED**
A2             DMA CONTROL REGISTER**
A4             BIU CONTROL REGISTER**
A6             M.M. DISPATCHER REG.**
A8             M.M. STATUS REGISTER**

*The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586
mode the 82596 will dump the same number of bytes as the 82586.

**These bytes are not user defined; results may vary from Dump command to Dump command.


Figure 34. Dump Area Format— 82586 Mode 
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31 

0 



Offset (hex)   Bits 31-16                              Bits 15-0

00             CONFIGURE BYTES 5, 4, 3, 2
04             CONFIGURE BYTES 9, 8, 7, 6
08             CONFIGURE BYTES 13, 12, 11, 10
0C             I.A. BYTES 1, 0                         X X X X X X X X
10             I.A. BYTES 5, 2
14             T.X. CRC BYTES 0, 1                     LAST T.X. STATUS
18             R.X. CRC BYTES 0, 1                     T.X. CRC BYTES 3, 2
1C             R.X. TEMP MEMORY 1, 0                   R.X. CRC BYTES 3, 2
20             R.X. TEMP MEMORY 5, 2
24             HASH REGISTERS 1, 0                     LAST R.X. STATUS
28             HASH REGISTER BYTES 5, 2
2C             SLOT TIME COUNTER                       HASH REGISTERS 7, 6
30             RECEIVE FRAME LENGTH                    WAIT-TIME COUNTER
34 to B0       MICRO MACHINE** REGISTER FILE (128 BYTES)
B4             MICRO MACHINE LFSR**
B8 to D0       MICRO MACHINE** FLAG ARRAY (28 BYTES)
D4 to E0       M.M. INPUT PORT** (16 BYTES)
E4             MICRO MACHINE ALU**
E8             RESERVED**
EC             M.M. TEMP A ROTATE R.**
F0             M.M. TEMP A**
F4             T.X. DMA BYTE COUNT**
F8             M.M. INPUT PORT ADDRESS REGISTER**
FC             T.X. DMA ADDRESS**
100            M.M. OUTPUT PORT REGISTER**
104            R.X. DMA BYTE COUNT**
108            M.M. OUTPUT PORT ADDRESS REGISTER**
10C            R.X. DMA ADDRESS REGISTER**
110            RESERVED**
114            BUS THROTTLE TIMERS
118            DIU CONTROL REGISTER**
11C            RESERVED**
120            DMA CONTROL REGISTER**
124            BIU CONTROL REGISTER**
128            M.M. DISPATCHER REG.**
12C            M.M. STATUS REGISTER**

The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586
mode the 82596 will dump the same number of bytes as the 82586.

**These bytes are not user defined; results may vary from Dump command to Dump command.


Figure 35. Dump Area Format— Linear and 32-Bit Segmented Mode 
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Diagnose 

The Diagnose Command triggers an internal self-test procedure that checks internal 82596 hardware, which
includes:

• Exponential Backoff Random Number Generator (Linear Feedback Shift Register). 

• Exponential Backoff Timeout Counter. 

• Slot Time Period Counter. 

• Collision Number Counter. 

• Exponential Backoff Shift Register. 

• Exponential Backoff Mask Logic. 

• Timer Trigger Logic. 

This procedure checks the operation of the Backoff block, which resides in the serial side and is not easily 
controlled. The Diagnose command is performed in two phases. 


The format of the 82596 Diagnose command is: 



Figure 36. Diagnose 


where: 

LINK ADDRESS, EL, B, C, I, S: As per standard Command Block (see the NOP command for details).

Bits 19-28: Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (bits 16-18): The Diagnose command. Value: 7h.

OK (bit 13): Indicates error free completion.

F (bit 11): Indicates that the self-test procedure has failed.
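
The command values given in this chapter (3h to 7h in bits 16-18) and the Diagnose result bits can be collected into a small helper. The following C sketch is illustrative only: the identifiers are hypothetical, and it assumes the first 32-bit word of the command block holds the command in its upper half and the status in its lower half, with C at bit 15 as suggested by the order of the C, B and OK fields in the figures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Command values quoted in the text (bits 16-18 of the first word). */
enum cb_cmd {
    CB_CMD_MC_SETUP = 3,
    CB_CMD_TRANSMIT = 4,
    CB_CMD_TDR      = 5,
    CB_CMD_DUMP     = 6,
    CB_CMD_DIAGNOSE = 7
};

#define CB_CMD_SHIFT  16
#define CB_CMD_MASK   0x7u
#define CB_STAT_C     (1u << 15)   /* assumed position of the C bit */
#define CB_STAT_OK    (1u << 13)   /* error free completion         */
#define CB_STAT_F     (1u << 11)   /* Diagnose: self-test failed    */

/* Extract the 3-bit command value from the first command-block word. */
static unsigned cb_opcode(uint32_t word0)
{
    return (word0 >> CB_CMD_SHIFT) & CB_CMD_MASK;
}

/* True only for a completed Diagnose command with OK set and F clear. */
static bool diagnose_passed(uint32_t word0)
{
    if (cb_opcode(word0) != CB_CMD_DIAGNOSE)
        return false;
    if ((word0 & CB_STAT_C) == 0)
        return false;
    return (word0 & CB_STAT_OK) != 0 && (word0 & CB_STAT_F) == 0;
}
```

A driver would typically poll or wait for the interrupt that accompanies C being set before calling a check like diagnose_passed.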
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RECEIVE FRAME DESCRIPTOR 

Each received frame is described by one Receive Frame Descriptor (see Figure 37). Two new memory 
structures are available for the received frames. The structures are available only in the Linear and 32-bit 
Segmented modes. 


Simplified Memory Structure

The first is the Simplified memory structure. The data section of the received frame is part of the RFD and is
located immediately after the Length Field. Receive Buffer Descriptors are not used with the Simplified
structure; it is primarily used to make programming easier. If the length of the data area described in the Size
Field is smaller than the incoming frame, the following happens.

1. The received frame is truncated.

2. The No Resource error counter is updated.

3. If the 82596 is configured to Save Bad Frames the RFD is not reused; otherwise, the same RFD is used to
hold the next received frame, and the only action taken regarding the truncated frame is to update the
counter.

4. The 82596 continues to receive the next frame in the next RFD.



Figure 37. The Receive Frame Area 
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Note that this sequence is very useful for monitoring. If the 82596 is configured to Save Bad Frames, to 
receive in Promiscuous mode, and to use the Simplified memory structure, any programmed length of received 
data can be saved in memory. 

The Simplified memory structure is shown in Figure 38. 




Figure 38. RFA Simplified Memory Structure 


Flexible Memory Structure 

The second structure is the Flexible memory structure. The data of the received frame is stored in both the
RFD and in a linked list of Receive Buffers and their Receive Buffer Descriptors (RBDs). The received frame
is placed in the RFD as configured in the Size field. Any remaining data is placed in a linked list of RBDs.

The Flexible memory structure is shown in Figure 39. 
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Buffers on the receive side can be different lengths. The 82596 will not place more bytes into a buffer than 
indicated in the associated RBD. The 82596 will fetch the next RBD before it is needed. The 82596 will 
attempt to receive frames as long as the FBL is not exhausted. If there are no more buffers, the 82596 
Receive Unit will enter the No Resources state. Before starting the RU, the CPU must place the FBL pointer in 
the RBD pointer field of the first RFD. All remaining RBD pointer fields for subsequent RFDs should be “1s.” If 
the Receive Frame Descriptor and the associated Receive Buffers are not reused (e.g., the frame is properly 
received or the 82596 is configured to Save Bad Frames), the 82596 writes the address of the next free RBD 
to the RBD pointer field of the next RFD. 


Receive Buffer Descriptor (RBD) 

The RBDs are used to store received data in a flexible set of linked buffers. The portion of the frame’s data 
field that is outside the RFD is placed in a set of buffers chained by a sequence of RBDs. The RFD points to 
the first RBD, and the last RBD is flagged with an EOF bit set to 1. Each buffer in the linked list of buffers
related to a particular frame can be any size up to 2^14 bytes but must be word aligned (begin on an even
numbered byte). This ensures optimum use of the memory resources while maintaining low overhead. All 
buffers in a frame are filled with the received data except for the last, in which the actual count can be smaller 
than the allocated buffer space. 
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31 ODD WORD 16 15 EVEN WORD 0 


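Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S X X X X X X X X X X X X X X        C B OK 0 STATUS BITS 0 0 0 0 0 0
4        A15 RBD OFFSET A0                       A15 LINK OFFSET A0
8        4th byte ... DESTINATION ADDRESS ... 1st byte
12       SOURCE ADDRESS 1st byte                 6th byte (DESTINATION ADDRESS)
16       6th byte ... SOURCE ADDRESS ... 4th byte
20       X X X X X X X X X X X X X X X X         LENGTH FIELD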


Figure 40. Receive Frame Descriptor — 82586 Mode 


31 ODD WORD 16 15 EVEN WORD 0 


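Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S 0 0 0 0 0 0 0 0 0 0 SF 0 0 0       C B OK STATUS BITS
4        A15 RBD OFFSET A0                       A15 LINK OFFSET A0
8        0 0 SIZE                                EOF F ACTUAL COUNT
12       4th byte ... DESTINATION ADDRESS ... 1st byte
16       SOURCE ADDRESS 1st byte                 6th byte (DESTINATION ADDRESS)
20       6th byte ... SOURCE ADDRESS ... 4th byte
24       LENGTH FIELD
         OPTIONAL DATA AREA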


Figure 41. Receive Frame Descriptor — 32-Bit Segmented Mode 


31 ODD WORD 16 15 EVEN WORD 0 


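Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        EL S 0 0 0 0 0 0 0 0 0 0 SF 0 0 0       C B OK STATUS BITS
4        A31 LINK ADDRESS A0
8        A31 RECEIVE BUFFER DESCRIPTOR ADDRESS A0
12       0 0 SIZE                                EOF F ACTUAL COUNT
16       4th byte ... DESTINATION ADDRESS ... 1st byte
20       SOURCE ADDRESS 1st byte                 6th byte (DESTINATION ADDRESS)
24       6th byte ... SOURCE ADDRESS ... 4th byte
28       LENGTH FIELD
         OPTIONAL DATA AREA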


Figure 42. Receive Frame Descriptor — Linear Mode 
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where: 

EL: When set, this bit indicates that this RFD is the last one on the RDL.

S: When set, this bit suspends the RU after receiving the frame.

SF: This bit selects between the Simplified or the Flexible mode.

    0: Simplified mode. All the RX data is in the RFD. The RBD ADDRESS field is all 1s.

    1: Flexible mode. Data is in the RFD and in a linked list of Receive Buffer Descriptors.

C: This bit indicates the completion of frame reception. It is set by the 82596.

B: This bit indicates that the 82596 is currently receiving this frame, or that the 82596 is ready to receive
the frame. It is initially set to 0 by the CPU. The 82596 sets it to 1 when reception set up begins, and to 0
upon completion. The C and B bits are set during the same operation.

OK (bit 13): Frame received successfully, without errors. RFDs with bit 13 equal to 0 are possible only if the
Save Bad Frames configuration option is selected. Otherwise all frames with errors will be discarded,
although statistics will be collected on them.

STATUS: The results of the Receive operation. Defined bits are:

    Bit 12:    Length error if configured to check length.
    Bit 11:    CRC error in an aligned frame.
    Bit 10:    Alignment error (CRC error in misaligned frame).
    Bit 9:     Ran out of buffer space (no resources).
    Bit 8:     DMA Overrun: failure to acquire the system bus.
    Bit 7:     Frame too short.
    Bit 6:     No EOP flag (for Bit stuffing only).
    Bit 5:     When the SF bit equals zero, and the 82596 is configured to save bad frames, this bit
               signals that the receive frame was truncated. Otherwise it is zero.
    Bits 2-4:  Zeros.
    Bit 1:     When it is zero, the destination address of the received frame matches the IA address.
               When it is a 1, the destination address of the received frame did not match the
               individual address. For example, a multicast address or broadcast address will set this
               bit to a 1.
    Bit 0:     Receive collision; a collision was detected during reception.

LINK ADDRESS: A 16-bit offset (32-bit address in the Linear mode) to the next Receive Frame Descriptor.
The Link Address of the last frame can be used to form a cyclical list.

RBD POINTER: The offset (address in the Linear mode) of the first RBD containing the received frame
data. An RBD pointer of all ones indicates no RBD.

EOF, F, SIZE, ACT COUNT: These fields are for the Simplified and Flexible memory models. They are
exactly the same as the respective fields in the Receive Buffer Descriptor. See the next section for a detailed
explanation of their functions.

MC: Multicast bit.

DESTINATION ADDRESS: The contents of the destination address of the receive frame. The field is 0 to 6
bytes long.

SOURCE ADDRESS: The contents of the Source Address field of the received frame. It is 0 to 6 bytes long.

LENGTH FIELD: The contents of this 2-byte field are user defined. In 802.3 it contains the length of the
data field. It is placed in memory in the same order it is received, i.e., most significant byte first, least
significant byte second.
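
The receive status bits above lend themselves to the same mask-based decoding as the transmit status. The following C sketch is a minimal illustration, not driver code from the data sheet: the names are hypothetical, the status is assumed to be the 16-bit even word of the first RFD word, and C is assumed to sit at bit 15 because it precedes B and OK (bit 13) in the figures.

```c
#include <stdbool.h>
#include <stdint.h>

#define RFD_C              (1u << 15)  /* assumed position: reception complete */
#define RFD_OK             (1u << 13)  /* received without errors              */
#define RFD_ERR_LENGTH     (1u << 12)
#define RFD_ERR_CRC        (1u << 11)
#define RFD_ERR_ALIGN      (1u << 10)
#define RFD_ERR_NO_RES     (1u << 9)
#define RFD_ERR_DMA_OVR    (1u << 8)
#define RFD_ERR_SHORT      (1u << 7)
#define RFD_ERR_NO_EOP     (1u << 6)
#define RFD_TRUNCATED      (1u << 5)
#define RFD_NOT_IA_MATCH   (1u << 1)   /* 1 = multicast/broadcast or other     */
#define RFD_RX_COLLISION   (1u << 0)
#define RFD_ERR_MASK       0x1FE0u     /* bits 12 down to 5                    */

/* A frame is usable when reception has completed and OK is set. */
static bool rfd_frame_good(uint16_t status)
{
    return (status & RFD_C) != 0 && (status & RFD_OK) != 0;
}

/* Count how many of the error/truncation flags (bits 12-5) are set. */
static unsigned rfd_error_count(uint16_t status)
{
    unsigned bits = status & RFD_ERR_MASK;
    unsigned n = 0;
    while (bits != 0) {
        n += bits & 1u;
        bits >>= 1;
    }
    return n;
}

/* Bit 1 clear means the destination matched the individual address. */
static bool rfd_matched_individual_address(uint16_t status)
{
    return (status & RFD_NOT_IA_MATCH) == 0;
}
```

Frames with OK clear only reach memory when Save Bad Frames is configured, so rfd_error_count is mainly of interest in monitoring setups.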
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NOTES 

1. The Destination address, Source address and Length fields are packed, i.e., one field immediately follows 
the next. 

2. The effect of the Address/Length Location (No Source Address Insertion) configuration parameter while
receiving is as follows:

- 82586 Mode: The Destination address, Source address and Length field are not used; they are placed in
the RX data buffers.

- 32-Bit Segmented and Linear Modes: when the Simplified memory model is used, the Destination address,
Source address and Length fields reside in their respective fields in the RFD. When the Flexible memory
structure is used, the Destination address, Source address, and Length field locations depend on the SIZE
field of the RFD. They can be placed in the RFD, in the RX data buffers, or partially in the RFD and the rest
in the RX data buffers, depending on the SIZE field value.


82586 Mode

Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        A15 NEXT RBD OFFSET A0                  EOF F ACTUAL COUNT
4        X X X X X X X X (bits 31-24)            A23 RECEIVE BUFFER ADDRESS A0
8        X X X X X X X X X X X X X X X X         EL X SIZE

32-Bit Segmented Mode

Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        A15 NEXT RBD OFFSET A0                  EOF F ACTUAL COUNT
4        A31 RECEIVE BUFFER ADDRESS A0
8        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0         EL P SIZE

Linear Mode

Offset   Bits 31-16 (ODD WORD)                  Bits 15-0 (EVEN WORD)

0        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0         EOF F ACTUAL COUNT
4        A31 NEXT RBD ADDRESS A0
8        A31 RECEIVE BUFFER ADDRESS A0
12       0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0         EL P SIZE



Figure 43. Receive Buffer Descriptor 
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where: 

EOF              Indicates that this is the last buffer related to the frame. It is cleared by
                 the CPU before starting the RU, and is written by the 82596 at the end of
                 reception of the frame.

F                Indicates that this buffer has already been used. The Actual Count has no
                 meaning unless the F bit equals one. This bit is cleared by the CPU before
                 starting the RU, and is set by the 82596 after the associated buffer has been
                 used. This bit has the same meaning as the Complete bit in the RFD and CB.

ACT COUNT        This 14-bit quantity indicates the number of meaningful bytes in the buffer.
                 It is cleared by the CPU before starting the RU, and is written by the 82596
                 after the associated buffer has been used. In general, after the buffer is
                 full, the Actual Count value equals the size field of the same buffer. For the
                 last buffer of the frame, Actual Count can be less than the buffer size.

NEXT BD ADDRESS  The offset (absolute address in the Linear mode) of the next RBD on the list.
                 It is meaningless if EL = 1.

BUFFER ADDRESS   The starting address of the memory area that contains the received data. In
                 the 82586 mode, this is a 24-bit address (with pins A24-A31 = 0). In the
                 32-bit Segmented and Linear modes this is a 32-bit address.

EL               Indicates that the buffer associated with this RBD is last in the FBL.

P                This bit indicates that the 82596 has already prefetched the RBDs and any
                 change in the RBD data will be ignored. This bit is valid only in the new
                 82596 memory modes, and only if this feature has been enabled during the
                 Configure command. The 82596 prefetches the RBDs in locked cycles; after
                 prefetching the RBD the 82596 performs a write cycle where the P bit is set
                 to one and the rest of the data remains unchanged. The CPU is responsible for
                 resetting it in all RBDs. The 82596 will not check this bit before setting it.

SIZE             This 14-bit quantity indicates the size, in bytes, of the associated buffer.
                 This quantity must be an even number.
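
The field descriptions above can be illustrated with a short host-side sketch in Python that decodes one RBD and walks a list of them to total a frame's length. The byte layout used here is an assumption, not taken from the text: a 16-byte, little-endian, Linear-mode descriptor with EOF/F in bits 15/14 of the first word, EL/P in bits 15/14 of the last word, and the 14-bit fields in bits 13-0. The names `Rbd`, `parse_rbd` and `frame_length` are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumed bit positions (the figure itself is not reproduced in the text).
EOF_BIT = 0x8000      # last buffer of the frame
F_BIT = 0x4000        # buffer has been used; ACT COUNT is valid
EL_BIT = 0x8000       # last RBD in the free buffer list
P_BIT = 0x4000        # RBD already prefetched by the 82596
FIELD_MASK = 0x3FFF   # 14-bit ACT COUNT / SIZE


@dataclass
class Rbd:
    eof: bool
    f: bool
    act_count: int
    next_bd: int
    buffer_addr: int
    el: bool
    p: bool
    size: int


def parse_rbd(raw: bytes) -> Rbd:
    """Decode one descriptor from 16 bytes (assumed Linear-mode layout)."""
    status, _pad0, next_bd, buf, size_word, _pad1 = struct.unpack("<HHIIHH", raw[:16])
    return Rbd(
        eof=bool(status & EOF_BIT),
        f=bool(status & F_BIT),
        act_count=status & FIELD_MASK,
        next_bd=next_bd,
        buffer_addr=buf,
        el=bool(size_word & EL_BIT),
        p=bool(size_word & P_BIT),
        size=size_word & FIELD_MASK,
    )


def frame_length(rbds: Iterable[Rbd]) -> Optional[int]:
    """Sum ACT COUNT over used buffers up to EOF; None if the frame is incomplete."""
    total = 0
    for rbd in rbds:
        if not rbd.f:          # ACT COUNT has no meaning unless F = 1
            return None
        total += rbd.act_count
        if rbd.eof:
            return total
    return None
```

A driver would clear EOF, F and ACT COUNT in each descriptor before restarting the RU, as the descriptions require; the sketch only covers the read side.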
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PGA PACKAGE THERMAL SPECIFICATION 


Parameter    Thermal Resistance

θJC          3°C/W

θJA          24°C/W




ELECTRICAL AND TIMING 
CHARACTERISTICS 

Absolute Maximum Ratings 

• Storage Temperature: -65°C to +150°C

• Case Temperature under Bias: to +110°C

• Supply Voltage with Respect to Vss: -0.5V to +6.5V

• Voltage on Other Pins: -0.5V to Vcc + 0.5V


DC Characteristics 

Tc = 0°C-85°C, Vcc = 5V ±10%. LE/BE have MOS levels (see VMIL, VMIH).
All other signals have TTL levels (see VIL, VIH, VOL, VOH).

Symbol  Parameter                            Min     Max         Units  Notes
VIL     Input Low Voltage (TTL)              -0.3    +0.8        V
VIH     Input High Voltage (TTL)             2.0                 V
VMIL    Input Low Voltage (MOS)              -0.3    +0.8        V
VMIH    Input High Voltage (MOS)             3.7     Vcc + 0.3   V
VOL     Output Low Voltage (TTL)                     0.45        V      IOL = 4.0 mA
VCIL    RXC, TXC Input Low Voltage           -0.5    0.6         V
VCIH    RXC, TXC Input High Voltage          3.3     Vcc + 0.5   V
VOH     Output High Voltage (TTL)            2.4                 V      IOH = 0.9 mA to 1 mA
ILI     Input Leakage Current                        ±15         µA     0 ≤ VIN ≤ Vcc
ILO     Output Leakage Current                       ±15         µA     0.45 < VOUT < Vcc
CIN     Capacitance of Input Buffer                  10          pF     FC = 1 MHz
COUT    Capacitance of Input/Output Buffer           12          pF     FC = 1 MHz
CCLK    CLK Capacitance                              20          pF     FC = 1 MHz
ICC     Power Supply                                 200         mA     At 25 MHz; ICC Typical = 100 mA
ICC     Power Supply                                 300         mA     At 33 MHz; ICC Typical = 150 mA
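
The thermal resistances and the supply-current limits above can be combined into a rough junction-temperature estimate. The following Python sketch is illustrative only: it uses the standard relations Tj = Tc + θJC × P and Tj = Ta + θJA × P, which are not stated in this data sheet, and it treats Vcc × ICC as an upper bound on the dissipated power. The function names are hypothetical, and the airflow conditions behind the θJA figure are not given here.

```python
# Rough thermal estimate for the PGA package; a sketch, not a data sheet procedure.
THETA_JC = 3.0    # °C/W, junction-to-case, from the thermal specification table
THETA_JA = 24.0   # °C/W, junction-to-ambient, from the thermal specification table


def supply_power_w(vcc_v: float, icc_ma: float) -> float:
    """Power drawn from the supply, used as an upper bound on dissipation."""
    return vcc_v * icc_ma / 1000.0


def junction_from_case(tc_c: float, power_w: float) -> float:
    """Junction temperature estimated from the case temperature."""
    return tc_c + THETA_JC * power_w


def junction_from_ambient(ta_c: float, power_w: float) -> float:
    """Junction temperature estimated from the ambient temperature."""
    return ta_c + THETA_JA * power_w


if __name__ == "__main__":
    # Assumed worst case at 25 MHz: Vcc = 5 V + 10 %, ICC = 200 mA, Tc = 85 °C.
    p = supply_power_w(5.5, 200.0)
    print(f"P = {p:.2f} W, Tj = {junction_from_case(85.0, p):.1f} °C")
```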
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AC Characteristics 

82596CA INPUT/OUTPUT SYSTEM TIMINGS 

Tc = 0°C-85°C, Vcc = 5V ±10%. These timings assume the CL on all outputs is 50 pF unless otherwise
specified. CL can be 20 pF to 120 pF; however, timings must be derated. All timing requirements are
given in nanoseconds.

                                                      25 MHz
Symbol  Parameter                                 Min        Max      Notes
        Operating Frequency                       12.5 MHz   25 MHz   1X CLK Input
T1      CLK Period                                40         80
T1a     CLK Period Stability                                 0.1%     Adjacent CLK Δ
T2      CLK High                                  14                  2.0V
T3      CLK Low                                   14                  0.8V
T4      CLK Rise Time                                        4        0.8V to 2.0V
T5      CLK Fall Time                                        4        2.0V to 0.8V
T6      BEn, LOCK, and A2-A31 Valid Delay         3          22
T6a     BLAST, PCHK Valid Delay                   3          27
T7      BEn, LOCK, BLAST, A2-A31 Float Delay      3          30
T8      W/R and ADS Valid Delay                   3          22
T9      W/R and ADS Float Delay                   3          30
T10     D0-D31, DPn Write Data Valid Delay        3          22
T11     D0-D31, DPn Write Data Float Delay        3          30
T12     HOLD Valid Delay                          3          22
T13     CA and BREQ Setup Time                    7                   1, 2
T14     CA and BREQ Hold Time                     3                   1, 2
T15     BS16 Setup Time                           8                   2
T16     BS16 Hold Time                            3                   2
T17     BRDY, RDY Setup Time                      8                   2
T18     BRDY, RDY Hold Time                       3                   2
T19     D0-D31, DPn READ Setup Time               5                   2
T20     D0-D31, DPn READ Hold Time                3                   2
T21     AHOLD and HLDA Setup Time                 10                  1, 2
T22     AHOLD Hold Time                           3                   1, 2
T22a    HLDA Hold Time                            3                   1, 2
T23     RESET Setup Time                          10                  1, 2
T24     RESET Hold Time                           3                   1, 2
T25     INT/INT Valid Delay                       1          26
T26     CA and BREQ, PORT Pulse Width             2T1                 1, 2, 3
T27     D0-D31 CPU PORT Access Setup Time         5                   2
T28     D0-D31 CPU PORT Access Hold Time          3                   2
T29     PORT Setup Time                           7                   2
T30     PORT Hold Time                            3                   2
T31     BOFF Setup Time                           10                  2
T32     BOFF Hold Time                            3                   2
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82596CA INPUT/OUTPUT SYSTEM TIMINGS 

Tc = 0°C-85°C, Vcc = 5V ±5%. These timings assume the CL on all outputs is 50 pF unless otherwise
specified. CL can be 20 pF to 120 pF; however, timings must be derated. All timing requirements are
given in nanoseconds.

                                                      33 MHz
Symbol  Parameter                                 Min        Max      Notes
        Operating Frequency                       12.5 MHz   33 MHz   1X CLK Input
T1      CLK Period                                30         80
T1a     CLK Period Stability                                 0.1%     Adjacent CLK Δ
T2      CLK High                                  11                  2.0V
T3      CLK Low                                   11                  0.8V
T4      CLK Rise Time                                        3        0.8V to 2.0V
T5      CLK Fall Time                                        3        2.0V to 0.8V
T6      BEn, LOCK, and A2-A31 Valid Delay         3          19
T6a     BLAST, PCHK Valid Delay                   3          22
T7      BEn, LOCK, BLAST, A2-A31 Float Delay      3          20
T8      W/R and ADS Valid Delay                   3          19
T9      W/R and ADS Float Delay                   3          20
T10     D0-D31, DPn Write Data Valid Delay        3          19
T11     D0-D31, DPn Write Data Float Delay        3          20
T12     HOLD Valid Delay                          3          19
T13     CA and BREQ Setup Time                    7                   1, 2
T14     CA and BREQ Hold Time                     3                   1, 2
T15     BS16 Setup Time                           6                   2
T16     BS16 Hold Time                            3                   2
T17     BRDY, RDY Setup Time                      6                   2
T18     BRDY, RDY Hold Time                       3                   2
T19     D0-D31, DPn READ Setup Time               5                   2
T20     D0-D31, DPn READ Hold Time                3                   2
T21     AHOLD and HLDA Setup Time                 8                   1, 2
T22     AHOLD Hold Time                           3                   1, 2
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AC Characteristics (Continued) 

82596CA INPUT/OUTPUT SYSTEM TIMINGS

CL on all outputs is 50 pF unless otherwise specified.
All timing requirements are given in nanoseconds.

                                                      33 MHz
Symbol  Parameter                                 Min        Max      Notes
T22a    HLDA Hold Time                            3                   1, 2
T23     RESET Setup Time                          8                   1, 2
T24     RESET Hold Time                           3                   1, 2
T25     INT/INT Valid Delay                       1          20
T26     CA and BREQ, PORT Pulse Width             2T1                 1, 2, 3
T27     D0-D31 CPU PORT Access Setup Time         5                   2
T28     D0-D31 CPU PORT Access Hold Time          3                   2
T29     PORT Setup Time                           7                   2
T30     PORT Hold Time                            3                   2
T31     BOFF Setup Time                           8                   2
T32     BOFF Hold Time                            3                   2


NOTES: 

1. RESET, HLDA, and CA are internally synchronized. This timing is to guarantee recognition at next clock for RESET, HLDA 
and CA. 

2. All set-up, hold and delay timings are at maximum frequency specification Fmax, and must be derated according to the 
following equation for operation at lower frequencies: 

Tderated = (Fmax/Fopr) × T
where:

Tderated = Specifies the value to derate the specification.

Fmax = Maximum operating frequency. 

Fopr = Actual operating frequency. 

T = Specification at maximum frequency. 

This calculation only provides a rough estimate for derating the frequency. For more detailed information, contact your 
Intel Sales Office for the data sheet supplement. 

3. CA pulse width need only be 1 T1 wide if the set up and hold times are met; BREQ must meet setup and hold times and 
need only be 1 T1 wide. 
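
As a worked illustration of the derating equation in Note 2, the Python sketch below scales a specification given at Fmax to a lower operating frequency. The function name `derate` and the input checks are assumptions added for illustration; as Note 2 says, the result is only a rough estimate.

```python
def derate(t_ns: float, f_max_mhz: float, f_opr_mhz: float) -> float:
    """Apply Tderated = (Fmax / Fopr) x T from Note 2.

    t_ns      -- specification at maximum frequency, in nanoseconds
    f_max_mhz -- maximum operating frequency (25 or 33 MHz in these tables)
    f_opr_mhz -- actual operating frequency
    """
    if f_opr_mhz <= 0:
        raise ValueError("operating frequency must be positive")
    if f_opr_mhz > f_max_mhz:
        raise ValueError("operating frequency exceeds Fmax")
    return (f_max_mhz / f_opr_mhz) * t_ns


if __name__ == "__main__":
    # T13 (CA and BREQ Setup Time) is 7 ns at 25 MHz; estimate it at 12.5 MHz.
    print(derate(7.0, 25.0, 12.5))
```

For example, halving the clock from 25 MHz to 12.5 MHz doubles the estimated setup time for T13.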


TRANSMIT/RECEIVE CLOCK PARAMETERS 


                                                          20 MHz
Symbol  Parameter                                     Min    Max    Notes
T36     TxC Cycle                                     50            1, 3
T38     TxC Rise Time                                        5      1
T39     TxC Fall Time                                        5      1
T40     TxC High Time                                 19            1, 3
T41     TxC Low Time                                  18            1, 3
T42     TxD Rise Time                                        10     4
T43     TxD Fall Time                                        10     4
T44     TxD Transition                                20            2, 4
T45     TxC Low to TxD Valid                                 25     4, 6
T46     TxC Low to TxD Transition                            25     2, 4
T47     TxC High to TxD Transition                           25     2, 4
T48     TxC Low to TxD High (At End of Transition)           25     4
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TRANSMIT/RECEIVE CLOCK PARAMETERS (Continued) 


                                                          20 MHz
Symbol  Parameter                                     Min    Max    Notes

RTS AND CTS PARAMETERS
T49     TxC Low to RTS Low,
        Time to Activate RTS                                 25     5
T50     CTS Low to TxC Low, CTS Setup Time            20
T51     TxC Low to CTS Invalid, CTS Hold Time         10            7
T52     TxC Low to RTS High                                  25     5

RECEIVE CLOCK PARAMETERS
T53     RXC Cycle                                     50            1, 3
T54     RXC Rise Time                                        5      1
T55     RXC Fall Time                                        5      1
T56     RXC High Time                                 19            1
T57     RXC Low Time                                  18            1

RECEIVED DATA PARAMETERS
T58     RXD Setup Time                                20            6
T59     RXD Hold Time                                 10            6
T60     RXD Rise Time                                        10
T61     RXD Fall Time                                        10

CRS AND CDT PARAMETERS
T62     CDT Low to TXC High,
        External Collision Detect Setup Time          20
T63     TXC High to CDT Inactive, CDT Hold Time       10
T64     CDT Low to Jam Start                                        10
T65     CRS Low to TXC High,
        Carrier Sense Setup Time                      20
T66     TXC High to CRS Inactive, CRS Hold Time
        (Internal Collision Detect)                   10
T67     CRS High to Jamming Start                                   12
T68     Jamming Period                                              11
T69     CRS High to RXC High,
        CRS Inactive Setup Time                       30
T70     RXC High to CRS High,
        CRS Inactive Hold Time                        10
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TRANSMIT/RECEIVE CLOCK PARAMETERS (Continued) 


                                                          20 MHz
Symbol  Parameter                                     Min    Max    Notes

INTERFRAME SPACING PARAMETERS
T71     Interframe Delay                                            9

EXTERNAL LOOPBACK-PIN PARAMETERS
T72     TXC Low to LPBK Low                                  T36    4
T73     TXC Low to LPBK High                                 T36    4


NOTES: 

1. Special MOS levels. VCIL = 0.9V and VCIH = 3.0V.

2. Manchester only.

3. Manchester. Needs 50% duty cycle.

4. 1 TTL load + 50 pF.

5. 1 TTL load + 100 pF.

6. NRZ only.

7. Abnormal end of transmission: CTS expires before RTS.

8. Normal end to transmission.

9. Programmable value:

   T71 = NIFS × T36

   where: NIFS = the IFS configuration value
   (if NIFS is less than 12 then NIFS is forced to 12).

10. Programmable value:

   T64 = (NCDF × T36) + x × T36

   (if the collision occurs after the preamble)

   where: NCDF = the collision detect filter configuration value, and
   x = 12, 13, 14, or 15

11. T68 = 32 × T36

12. Programmable value:

   T67 = (NCSF × T36) + x × T36

   where: NCSF = the Carrier Sense Filter configuration value, and
   x = 12, 13, 14, or 15

13. To guarantee recognition on the next clock. 
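
Notes 9 through 12 express several timings as multiples of the TxC cycle, T36. The Python sketch below evaluates those formulas for a given T36 in nanoseconds. The function names are hypothetical, and because the notes allow x to be any of 12, 13, 14, or 15, the two jam-start helpers return a (minimum, maximum) pair rather than a single value.

```python
from typing import Tuple

X_VALUES = (12, 13, 14, 15)   # possible values of x in Notes 10 and 12
MIN_IFS = 12                  # NIFS below 12 is forced to 12 (Note 9)


def interframe_delay(n_ifs: int, t36_ns: float) -> float:
    """Note 9: T71 = NIFS x T36, with NIFS forced to at least 12."""
    return max(n_ifs, MIN_IFS) * t36_ns


def cdt_to_jam_start(n_cdf: int, t36_ns: float) -> Tuple[float, float]:
    """Note 10: T64 = (NCDF x T36) + x x T36, collision after the preamble."""
    return ((n_cdf + min(X_VALUES)) * t36_ns, (n_cdf + max(X_VALUES)) * t36_ns)


def jamming_period(t36_ns: float) -> float:
    """Note 11: T68 = 32 x T36."""
    return 32 * t36_ns


def crs_to_jam_start(n_csf: int, t36_ns: float) -> Tuple[float, float]:
    """Note 12: T67 = (NCSF x T36) + x x T36."""
    return ((n_csf + min(X_VALUES)) * t36_ns, (n_csf + max(X_VALUES)) * t36_ns)


if __name__ == "__main__":
    t36 = 100.0  # ns; an assumed 10 MHz transmit clock, chosen only for illustration
    print(interframe_delay(96, t36), jamming_period(t36), cdt_to_jam_start(0, t36))
```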



82596CA BUS OPERATION 

The following figures show the 82596CA basic bus cycle and basic burst cycle. 









SYSTEM INTERFACE A.C. TIMING CHARACTERISTICS 

The measurements should be done at: 

• Tc = 0°C-85°C, Vcc = 5V ±10%, C = 50 pF unless otherwise specified.

• A.C. testing inputs are driven at 2.4V for a logic “1” and 0.45V for a logic “0”.

• Timing measurements are made at 1.5V for both logic “1” and “0”.

• Rise and fall times of input and output signals are measured between 0.8V and 2.0V unless
otherwise specified.

• All timings are relative to CLK crossing the 1.5V level.

• All A.C. parameters are valid only after 100 µs from power up.



Figure 46. CLK Timings 


Two types of timing specifications are presented below: 

1. Input Timings: minimum setup and hold times.

2. Output Timings — output delays and float times from CLK rising edge. 


Figure 47 defines how the measurements should be done: 






LEGEND:

Ts = Input Setup Time

Th = Input Hold Time

Tn = Minimum output delay or minimum float delay

Tx = Maximum output delay or maximum float delay


Figure 47. Drive Levels and Measurement Points for A.C. Specifications


Ts = T13, T15, T17, T19, T21, T23, T27, T29, T31
Th = T14, T16, T18, T20, T22, T22a, T24, T28, T30, T32
Tn = T6, T6a, T7, T8, T9, T10, T11, T12, T25
Tx = T6, T6a, T7, T8, T9, T10, T11, T12, T25
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Figure 51. Input Setup and Hold Time 
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Figure 52. Output Valid Delay Timing 
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OUTLINE DIAGRAMS 


132 LEAD CERAMIC PIN GRID ARRAY PACKAGE INTEL TYPE A 



[Outline drawing not reproduced. Labels: seating plane, ØB (all pins), swaged pin detail. Dimensions in mm (inch).]



Family: Ceramic Pin Grid Array Package

                Millimeters                     Inches
Symbol    Min     Max     Notes           Min      Max      Notes
A         3.56    4.57                    0.140    0.180
A1        0.76    1.27    Solid Lid       0.030    0.050    Solid Lid
A2        2.67    3.43    Solid Lid       0.105    0.135    Solid Lid
A3        1.14    1.40                    0.045    0.055
B         0.43    0.51                    0.017    0.020
D
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Intel Case Outline Drawings 
Plastic Quad Flat Pack (PQFP)
0.025 Inch (0.635mm) Pitch 
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i 960 TM FAMILY OF SOFTWARE DEBUGGERS 





COMPREHENSIVE SOFTWARE DEBUG SUPPORT FOR i960™
EMBEDDED APPLICATIONS

Intel provides comprehensive software debug support for all members of the i960™ 
component architecture, including the newest members, the i960SA and i960SB. All 
Intel’s i960 software debug products share the same high-level, windowed user interface 
emerging as the standard for all i960 tools from Intel. This innovative debug interface 
allows users to focus their efforts on finding bugs rather than spending time learning and 
manipulating the debug environment. 

Intel’s i960 software debug tools support a wide variety of debug environments, including 
code debug on a simulated target environment, a PC-based evaluation board, a serial- 
based Intel evaluation board, or a serial-based, customized target system. 


GENERAL i960 SOFTWARE DEBUGGER FEATURES 


• Windowed, pull down menu user 
interface shared by other i960 
Development Tools 

• Full symbolic debug with source level 
display allows C or assembly code 
debugging 

• Debugging productivity enhanced by 
ability to quickly browse source code and 
view call stacks or symbol run-time 
values 


• Breakpoints may be defined symbolically 
using module names, procedure names 
and line numbers 

• Single step execution, code assembly/ 
disassembly, memory and register 
display/modification 

• Run-time library support allows 
programs to access host files and perform 
I/O 


*IBM, PC/AT, and Personal System/2 are registered trademarks of International Business Machines Corporation.
*Compaq is a registered trademark of the Compaq Corporation.

*Intel is a registered trademark of the Intel Corporation.
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FEATURES 


EASY TO USE, POWERFUL 
USER INTERFACE 

All i960 debuggers share the same high-level, 
powerful user interface as other i960 
development tools. Utilizing pulldown menus, 
users have access to a color, windowed 
environment featuring source-level, symbolic 
debugging. Multiple, non-overlapping windows 
can be used to display source code, registers, 
variable values, and command line entries. 

DEBUGGING FEATURES 

High-level source or disassembled code can be 
displayed in the source window. Users can 
scroll through the source, browse from module 
to module in a program, scope to any 
executable point in the source, or 
instantaneously relocate from a symbol name 
to the location where it was defined 
(hyperscope operation). Symbol names in the 
source can be highlighted to inspect the 
current run-time value of program variables. 
Call stacks can be examined to trace execution 
flow. 

A variety of breakpoints can be specified 
including source breakpoints, watch points, 
passpoints, or event-action breakpoints. 
Breakpoints can be defined symbolically using 
module names, procedure names and line 
numbers. Watch points allow users to observe 
a variable as it changes during program 
execution. Passpoints display a message when 
a specified instruction is executed, giving the 
user a non-realtime way to track execution of 
key code sequences without halting instruction 
flow. The event-action form allows complex 
breakpoint conditions to be set up, including 
data breakpoints (when supported by on-chip 
registers). 

Users can step through program execution via 
a single assembly language instruction, a high- 
level language statement or a high-level 
function or procedure. Memory can be 
displayed or modified as common data types 
and all processor registers and system tables 
can be examined or changed. 

Expressions involving symbol names, memory 
references, or both, can be defined as watch 
expressions whose values are monitored in a 
Watch window as a program executes. The 
i960 family of software debuggers also allows 
screen flipping between the debugger 
environment and the display output from the 
program. 


Low level, run time libraries are provided that 
allow programs running on an i960 board to 
access the file system on the host or to perform 
I/O operations. 

RETARGETABLE SOFTWARE 
DEBUGGER 

Intel’s DB-960 Retargetable Software 
Debugger is a combination application and 
system level debugger designed for use with 
the i960 family of embedded microprocessors. 
DB-960’s retargetable monitor can be 
customized to a target system, allowing source- 
level, symbolic debug across a serial interface 
cable. 

RETARGETABLE MONITOR 

Utilizing a combination of object files and 
source code, a retargetable monitor is provided 
with DB-960 for users to customize and 
incorporate into their proprietary target 
systems. This retargetable monitor is designed 
to support all members of the i960 family. Most 
of the monitor code is provided in object code 
and does not need to be changed. Hardware- 
dependent source code is supplied for 
modification by users. Example code is 
provided for porting the monitor to the Intel 
EV80960CA and QT960 target boards. Both 
boards use an Intel 82510 UART serial 
controller chip and the Intel 82C54 Counter/ 
Timer. 

HARDWARE DEBUG 

DB-960 takes advantage of on-chip debug
registers like those found on the i960CA to
provide two hardware execution address
breakpoints and two data address breakpoints.
Once the monitor has been retargeted to the 
target system, hardware designers can 
download initialization code, read/write to 
registers and examine memory or register 
contents. 

HIGHSPEED SERIAL LINK 

DB-960 communication between the host and
target system is supported via RS232 and
RS422 communication links. RS232 allows
access to industry standard serial protocols
while the RS422 interface provides higher
speed communication (up to 115K baud) for
faster code and data download. PC-AT
bus-compatible RS422 communication boards are
available from various third party vendors.
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FEATURES 


CUSTOMIZED ENVIRONMENT 

Because the user has control over the target 
board and serial driver source code, a highly 
customized target environment can be 
developed. Serial communication functions can 
be modified to allow for parallel 
communication schemes, allowing faster 
download speeds. 

LICENSING 

There are no incorporation or royalty fees for 
customers shipping the retargeted DB-960 
monitor with their product or system. 

PC-BASED SOFTWARE 
DEBUGGER 

The DB960KBDEVA Software Debugger is
designed for debugging i960KA or i960KB code
executing on an Intel EVA-960KB4MB
Software Execution Vehicle plugged into
PC-ATs or compatibles using DOS.
DB960KBDEVA offers the same powerful
debug user interface as other i960 software
debuggers and utilizes I/O resources provided
by the PC. Due to compatibility with the
i960KA and i960KB, i960SA and i960SB code
can be executed and debugged using the Intel
EVA-960KB4MB Software Execution Vehicle
in conjunction with the DB960KBDEVA
Software Debugger.

SIMULATOR-BASED 
SOFTWARE DEBUGGER 

The DBSIM960 Debug Simulator combines an 
i960 CA/KA/SA instruction-level simulator 
with the easy to use, powerful DB960 software 
debugger interface. Users can debug i960 
applications without a hardware target system 
being available, allowing products to get to 
market sooner. For i960 CA designs, 
performance information is provided, with 
timing profiles accurate to plus or minus 5%. 

Users can specify the target system’s clock 
speed and wait-state information for each 
region of memory.* DBSIM960 uses this 
information to provide i960 CA performance 
statistics. DBSIM960 expects COFF executable 
files generated by Intel’s CTOOLS960 compiler 
and assembler. Execution flow can be 
monitored by using a trace capability, which 
reports the 8 digit cycle address, 8 digit 
instruction pointer value, and the 
disassembled instruction for each operation. 


Program execution statistics reported
include:

• Total number of instructions executed

• Total time

• Number of times a call caused processor to

• Current clock setting in cycles per second

• Current wait-state setting for each of the 16
memory regions

• Number of instruction words executed from
cache rather than external memory

• Total number of cycles elapsed

• Number of stack frames or register sets
cached on chip

• Number of times an unaligned load or store
operation occurred

• Bus utilization

• Branch prediction efficiency

• Usage for load, store, call and branch cache
instructions

Generally, DBSIM960 provides all the full 
symbolic, debug capabilities found in the i960 
family of debug tools, while providing a 
complete benchmarking environment prior to 
target system availability. 

*By being able to easily change the waitstate definition for 
their code, the user’s hardware and software design can be 
optimized before any hardware development takes place. 

IN CIRCUIT DEBUG MONITOR 

Intel’s DB960CADIC in-circuit debug monitor
hosted on extended DOS/386 allows users to
debug high-speed, cached applications at the
full speed of the i960CA target processor.
DB960CADIC can be used by both hardware
and software developers, at any stage of design.
Early in the development process,
DB960CADIC allows software debugging when
inserted into an existing i960CA board such as
the EV80960CA, or in the DB960CASAST
stand-alone self-test unit. Later in the design
cycle, DB960CADIC can be inserted into the
user’s target system, facilitating debug of
hardware/software integration.

DB960CADIC offers the same, windowed debug 
user interface as other i960 software debuggers 
and is also available with an optional 4 MB 
standalone self test chassis to debug and test 
code before prototype hardware is available. 
For further information, see fact sheet 
#280900 from Intel. 
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FEATURES 


SOFTWARE COMPLETES THE 
SYSTEM 

Intel provides a comprehensive software
development environment to complement
DB-960. This environment includes a C Compiler,
an i960 Assembler, a system generator for
automating the compilation process and
instruction-level simulators. The languages
support the entire range of i960 embedded
processors.

WORLDWIDE SERVICE,
SUPPORT, AND TRAINING

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 


Intel also offers a Software Support Contract 
which includes technical software information, 
automatic distributions of software and 
documentation updates, iCOMMENTS 
publication, remote diagnostic software, and a 
development tools troubleshooting guide. 

Intel’s 90-day Hardware Support package 
includes technical hardware information, 
telephone support, warranty on parts, labor, 
material, and on-site hardware support. 

Intel Development Tools also offers a 30-day, 
money-back guarantee to customers who are 
not satisfied after purchasing any Intel 
development tool. 


SPECIFICATIONS AND REQUIREMENTS 


HOST SYSTEM REQUIREMENTS 

Host system requirements to run Intel’s i960 
family of software debuggers include the 
following: 

• DOS version 3.3 or later excluding DOS 4.0 

• 640K bytes of RAM in conventional memory 

• A fixed disk drive with at least 1.25M bytes 
of free disk space 

• One disk drive capable of reading 5.25 inch, 
360K byte disks 

• RS232 serial port (COM1 or COM2)


Evaluated Systems include: 

IBM PC-AT* with DOS 3.3 

COMPAQ 386* with DOS 3.3 

Intel 307302* with DOS 3.3 

IBM Personal System/2* Model 70/80 with 

DOS 4.01 
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ORDERING INFORMATION 


DB960KBDEV     DOS-based, retargetable software debugger for the i960KA, i960KB,
               i960SA, i960SB and i960CA embedded microprocessors. Includes host
               debug software, retargetable monitor, host I/O libraries and
               documentation.

DB960KBDEVA    DOS-based source level debugger for the i960KA, i960KB, i960SA and
               i960SB embedded microprocessors. Requires EVA-960KB4MB Software
               Execution Vehicle and PC-AT compatible bus.

DBSIM960D      DOS/386-hosted debug simulator for the i960 CA, i960 KA and i960 SA
               which utilizes an i960 CA instruction-level simulator allowing code
               development and debug prior to hardware prototype availability.

DBSIM960S      UNIX System V/386-hosted debug simulator for the i960 CA, i960 KA and
               i960 SA which utilizes an i960 CA instruction-level simulator allowing
               code development and debug prior to hardware prototype availability.

DBSIM960R      IBM RS/6000-hosted debug simulator for the i960 CA, i960 KA and
               i960 SA which utilizes an i960 CA instruction-level simulator allowing
               code development and debug prior to hardware prototype availability.

DB960CADIC     DOS/386-hosted in-circuit debug monitor for i960CA only. Includes
               small board with i960CA processor, system debug monitor and serial
               interface. Plugs into i960CA socket on hardware prototype system.

DB960CASAST    Standalone Self Test Unit for DB960CADIC. Includes built-in power
               supply, self-test board, 4M byte of usable DRAM for code development
               and enclosure.

To order your Intel Development Tool product, 
for more information, or for the number of 
your nearest sales office or distributor, call 
800-874-6835 (North America). For literature 
on other Intel products call 800-548-4725 
(North America). Outside of North America, 
please contact your local Intel sales office or 
distributor for more information. 
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EXV-960MC EXECUTION VEHICLE 




80960MC-BASED TARGET SYSTEM SUPPORTING EARLY 
SOFTWARE DEVELOPMENT AND BENCHMARKING 

EXV-960MC is a software execution vehicle designed to support 80960MC-based designs. 
Users can use the EXV-960MC board to execute and debug their application software 
before a functional hardware prototype is available. The EXV-960MC is also designed 
with programmable waitstate SRAM to support benchmarking activities. The EXV- 
960MC is supported by the complete set of Intel C, assembler and Ada code generation 
tools. Both of the VAX/VMS*-hosted 80960MC software debuggers, the SDM-960MC 
system debug monitor and the Ada-960MC source-level debugger, can be used for 
debugging software running on the EXV-960MC. 

EXV-960MC includes a Multibus I form factor board and a set of SDM-960MC target
monitor EPROMS. The SDM-960MC and the Ada-960MC debugger are preconfigured to 
support the EXV-960MC execution environment. Designers can select the software 
debugger best suited to their development needs. The Ada-960MC debugger is a source- 
level symbolic debugger which provides a productive debugging environment for Ada 
applications. The SDM-960MC debug monitor offers a complete debugging facility for 
applications written in C, assembler or Ada. 


* VAX/VMS is a trademark of Digital Equipment Corp. 
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SDM-960MC RETARGETABLE SYSTEM DEBUG 

MONITOR 


FEATURES 

• 25 MHz 80960MC processor 

• 256 Kbytes of (0,0,0,0) programmable wait-state SRAM

• 4 Mbytes dual-ported (3,1,1,1) wait-state DRAM

• iSBX™ interface

• Two serial ports, one bi-directional parallel port 

• 8254 programmable interval timer 

• 8259A programmable interrupt controller 

ELECTRICAL CHARACTERISTICS 

10A @ +5V
50mA @ +12V
50mA @ -12V

ENVIRONMENTAL CHARACTERISTICS 

Operating temperature: 0° to +60°C (32° to 140°F), 300 LFM
Operating Humidity: 10% to 90% non-condensing 

SOFTWARE DEBUGGING SUPPORT 

The SDM-960MC is a VAX/VMS*-hosted system debug monitor that provides a complete, flexible 
environment to execute and debug 80960MC-based applications. Users can tailor the execution 
environment as software development evolves. Initially, the application may require the full 
support of the system debug monitor to establish a run-time environment. As the application 
evolves, the SDM-960MC allows the application to take more of the responsibility for system 
functions. 

The default execution environment of the SDM-960MC is the EXV-960MC execution vehicle. The
VAX-hosted portion of the SDM-960MC debug monitor provides complete on-target debugging
support through its interface with the target-resident portion of the SDM-960MC. To facilitate 
debugging on a user’s custom target system, the SDM-960MC includes source and object files 
necessary to reconfigure the target monitor. SDM-960MC and other 80960MC development tools 
allow the developers to take full advantage of the 80960MC processor. 

FEATURES 

• assemble and disassemble 80960MC 
instructions 

• single step program execution 

• access to memory and processor resources 

• support 64 execution breakpoints 

• issue Interagent Communications (IACs)

• powerful execution trace 

• serial download 

HARDWARE REQUIREMENTS 

• a serial interface 

• 25 Kbytes of EPROM 

• contiguous 50 Kbytes of RAM 

WORLDWIDE SERVICE AND
SUPPORT

Intel augments its 80960 architecture family
development tools with a full array of
seminars, classes, and workshops; on-site
consulting services and telephone support are
available at all stages of development.

ORDERING INFORMATION

Product Code   Description

EXV960MC       80960MC execution vehicle
               (board and target EPROM)

SDM960MC       VAX, MicroVAX/VMS hosted System Debug
               Monitor, retargetable source
               is included
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80960SA/SB DEVELOPMENT SUPPORT 





COMPREHENSIVE DEVELOPMENT SUPPORT FOR 80960SA/SB
EMBEDDED APPLICATIONS

Intel provides comprehensive development support for the 80960 component
architecture, including the newest members, the 80960SA and 80960SB. Tools range from
compilers to simulators and from debuggers to emulators. All are designed specifically for
members of the 80960 family, allowing you to take full advantage of their RISC-based
design while reducing time to market.

DEVELOPMENT TOOLS AVAILABLE:

• ASM-960 macro assembler for 
developing and tuning speed-critical code 

• iC-960 highly optimizing C language 
compiler for high-level language 
software development 

• GEN-960 system generator for 
initializing your design to take 
advantage of 80960 on-chip features 

• DB/SIM960KA debug simulator for 
80960KA and 80960SA applications 


• Windowed, interactive, source-level DB- 
960 debugger which can be targeted to 
one of the evaluation and development 
boards below, or customized to your 
target system 

• Evaluation and development boards 
including the EV960SB, the QT80960KB, 
and the EVA960KB 

• ICE-960SA/SB offers a full featured in- 
circuit emulator for the 80960SA/SB 
components 
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80960SA/SB DEVELOPMENT SUPPORT 



ASM-960 MACRO ASSEMBLER 

The ASM-960 macro assembler is used to fine- 
tune sections of code for peak program 
execution speed on the 80960SA, 80960SB, 
80960KA, 80960KB, 80960MC, and 80960CA. 
ASM-960 does this by giving you absolute 
control over program instructions. In addition 
to the assembler and macro preprocessor, 
ASM-960 includes several utilities for 
application program maintenance and debug: 

• LINKER provides incremental program 
linking/locating and link-time optimization. 

• ARCHIVER allows you to build reusable 
function libraries for applications. 

• DISASSEMBLER produces assembly 
language from object files. 

• SYMBOL DUMPER provides symbolic 
information from a program file for 
facilitating low-level debug. 

• ROM IMAGE BUILDER produces a hex file 
suitable for PROM programmers. 

• Macro preprocessor provides code generation 
flexibility and improves code readability, 
reducing maintenance costs. 

A Floating Point Arithmetic Library (FPAL) is 
included for the 80960SA, 80960KA, and 
80960CA components. It eliminates the need to 
develop your own floating point code. 


GEN-960 SYSTEM GENERATOR 

The 80960 System Generator (GEN-960) helps 
you set up data structures for standalone, 
embedded applications that use the on-chip 
features of the 80960 architecture. GEN-960 is 
used with other 80960 tools to generate and 
refine ROM or RAM code. GEN-960 supplies a 
set of command and template files containing 
assembly code and linker control commands to 
set up processor control blocks, inter-agent 
communication mechanisms, system procedure 
tables, and other requirements for 
initialization. The result is a batch file 
containing all the commands needed to 
compile, assemble and link the final target 
system. 

• Improves engineering productivity by 
automating the compilation, assembly and 
linking process 

• Supplies sample initialization code, reducing 
programming time 

• Saves engineering time by simplifying the
task of initializing each processor for on-chip
capabilities


iC-960 COMPILER 

iC-960 is a highly optimizing C language 
compiler for the 80960 family of 
microprocessors. iC-960 supports the full C 
language as described in the Kernighan and 
Ritchie book, The C Programming Language 
(Prentice-Hall, 1978). iC-960 includes standard 
ANSI extensions to the C language and is used 
in conjunction with ASM-960 for creating 
object files. 

The iC-960 compiler supports a number of 
processor dependent optimizations including 
global register allocation, constant 
propagation, arithmetic identity folding, 
redundant load/store elimination, strength
reduction and register allocation/scheduling of 
arguments. Processor independent 
optimizations include common sub-expression 
elimination, folding of constant expressions, 
elimination of superfluous branches, removing 
unreachable code, tail recursion and procedure 
incorporation. 

iC-960 includes a standard C library with I/O 
functions and mathematical routines. A second 
library provides low level, environment- 
dependent routines emulating UNIX* system 
calls and supplies I/O routines for the EVA- 
960 Software Execution Vehicle. 

iC-960 also includes the following 
enhancements for embedded application 
development: 

• Programs may be easily placed in ROM.

• Memory-mapped I/O allows high-level
language access to application-specific input
and output.

• In-line assembly simplifies the integration of
C language and assembly code for speed-critical
functions.

• Floating point support produces in-line code
to take full advantage of the floating point
capability of the 80960SB, 80960KB and
80960MC.

Symbolic debugging of source code for iC-960 
and ASM-960 is provided by the DB-960 Source 
Level Debugger, the DBSIM960KA debugging 
simulator, the DB960CADIC in-target 
debugger, and the ICE960SB and ICE960KB 
emulators. 


DEBUGGING SIMULATOR 

The DBSIM960KA simulator features an easy 
to use, pulldown menu user interface combined 
with an 80960SA/80960KA instruction 
simulator. DBSIM960KA facilitates debugging 
80960SA and 80960KA applications by 
providing debugging capabilities before target 
hardware is available. DBSIM960KA’s 
powerful, windowed, source-oriented interface 
allows you to focus your efforts on finding bugs 
rather than on learning and manipulating the 
debug environment. 

Ease of learning. Drop-down menus make the 
debugger easy to learn for new or casual users. 
A command line interface allows direct 
command entry for solving more complex 
problems, improving productivity of 
knowledgeable users. 

Extensive debug modes. You can set 
conditional breakpoints, pass points, and 
temporary breakpoints as needed. 

See into your program. Using pull-down 
menus or function keys, you can browse source 
and Call stacks, monitor processor registers, 
view screen output, and watch the values of 
variables change. 

Full debug symbolics for maximum 
productivity. You need not know whether a 
variable is an unsigned integer, a real, or a 
structure: the debugger displays program 
variables in their respective type formats.


EVA-960KB4MB SOFTWARE
EXECUTION VEHICLE

The EVA-960KB4MB is a software execution
vehicle for the 80960KA/KB microprocessor. It
is a single PC AT plug-in board which provides
convenient 80960 architecture evaluation
and benchmarking, as well as software
development. Since the board uses an
80960KB, 80960SA and 80960SB performance
can be extrapolated. The EVA-960KB4MB
contains the following:

• 4 MB or 16 MB (EVA960KB16MB) of one
wait-state program memory (DRAM)

• 64 Kbytes of zero wait-state program
memory (SRAM)

• Three-channel programmable interval timer

• Hosted debug monitor which supports two
hardware and 64 software breakpoints,
single-step program execution, register and
memory access, program download and
upload

• DOS access libraries that allow screen
display, keyboard input, reading and writing disk
files, and the ability to spawn a DOS process
that could communicate with serial or
parallel I/O

• 20 MHz operation, allowing software to
operate at full speed of 80960KB

EVA-960KB4MB also operates with the DB-960
Source Level Debugger for code
development/debug prior to target system
availability.

SOURCE-LEVEL DEBUGGER

The DB-960 Debugger with source-level debug
capabilities is available for PC ATs equipped
with DOS. DB-960 can debug 80960 code
executing on an Intel EVA-960 Software
Execution Vehicle or on a hardware target
system via a serial interface. The EVA-960
targeted debugger uses I/O resources provided
by the PC, while 80960 code executes at high
speed on the EVA-960. Two serial versions of
DB-960 are available. DB-960CADIC plugs
directly into the 80960CA socket on your
prototype, offering a "plug-in and go" debug
environment. DB-960D is a serial, retargetable
version of DB-960 whose system debug monitor
can be customized for 80960SA/SB, 80960KA/KB,
or 80960CA operation.

Ease of learning. Drop-down menus make the
debugger easy to learn for new or casual users.
A command line interface allows direct
command entry for solving more complex
problems, improving productivity of
knowledgeable users.

Extensive debug modes. You can set
conditional breakpoints, pass points, and
temporary breakpoints as needed.

See into your program. Using pull-down
menus or function keys, you can browse source
and Call stacks, monitor processor registers,
view screen output, and watch the values of
variables change.

Full debug symbolics for maximum
productivity. You need not know whether a
variable is an unsigned integer, a real, or a
structure: the debugger displays program
variables in their respective type formats.

In-Target Debug. Porting the DB960D
retargetable monitor to your target system
allows the debugger to be used in-target, thus
facilitating debugging of code dependent upon
hardware interaction.


ICE960SB IN-CIRCUIT
EMULATOR

ICE960SB is a full featured in-circuit emulator
for the 80960SA and 80960SB components. A
separate ICE probe can be purchased to
support 80960KA and 80960KB components.
ICE960SB includes:

• Full speed emulation of the 80960SA/SB
components to 16 MHz

• Complete symbolic information when used
with Intel 80960 compilers

• 1024 Frames Bus or Execution Trace with
Time-Tags

• Comprehensive break capabilities including
execution addresses, instruction type, bus
read/write/access, data values, and external
synch lines

• Qualification of break conditions based on an
8-state machine or an occurrence counter

• Fastbreaks to dynamically access memory or
variables during emulation

• Examine and modify memory and 80960
registers

• Stand-Alone Self-Test module provides
diagnostic circuitry and 256 Kbytes of
memory for software development

• Optional 2 Mbyte of relocatable expansion
memory

• Support for socketed and surface mounted 84
Pin PLCC components and surface mounted
80 Pin EIAJ components via ONCE mode

• DOS Hosting with support for RS232 and
RS422 communication links

WORLDWIDE SERVICE,
SUPPORT, AND TRAINING

To augment its development tools, Intel offers
a full array of seminars, classes, and
workshops, field application engineering
expertise, hotline technical support, and
on-site service.

Intel also offers a Software Support package
which includes technical software information,
telephone support, automatic distribution of
software and documentation updates, access to
the "ToolTalk" electronic bulletin board,
"iCOMMENTS" publication, remote diagnostic
software, and a development tools
troubleshooting guide.

Intel’s Hardware Support package includes
technical hardware information, telephone
support, warranty on parts, labor, material,
and on-site hardware support.


80960SA/SB DEVELOPMENT 
TOOLS 

ASM960          Assembler package containing the assembler,
                linker/loader, macro preprocessor, archiver,
                ROM image builder, other object file utilities,
                and the 80960SA/KA/CA floating point
                arithmetic library.

C960            Optimizing C Compiler, with ANSI extensions
                for embedded control applications; contains
                standard STDIO libraries and in-line assembly
                capability.

GEN960          80960 System Generation software automates
                the compilation, assembly and linking process.
                Simplifies usage of 80960 sophisticated features.

DBSIM960KA      Debugging Simulator software emulates the
                80960SA and 80960KA instruction set, allowing
                code development and debugging prior to
                hardware prototype availability.

DB960KBDEVA     Source Level Debugger software for the
                80960KB/KA with powerful debug capabilities
                including conditional breakpoints, source and
                Call stack browsing, memory/register display
                and modification, and ability to watch
                variables change value. Requires EVA-960KB4MB
                Software Execution Vehicle. For PC AT hosted
                systems only.

DB960D          Source Level Debugger software for 80960SA/SB,
                80960KA/KB, or CA processors resident on
                serially-interfaced hardware prototype systems.
                Includes customizable system debug monitor and
                serial interface protocol specifications. For
                PC AT hosted systems only.

EVA960KB4MB     Software Execution Vehicle for 80960SA/SB and
                80960KA/KB components. Includes 4 Mbyte of
                on-board memory, system debug monitor and code
                download software. Code compatible with the
                80960SA/SB components. Required by DB960KBDEVA.

EVA960KB16MB    Identical to EVA960KB4MB with 16 Mbyte of DRAM
                instead of 4 Mbyte.

ICE960SB        In-Circuit emulator for the 80960SA/SB
                components. Includes ICE base and probe,
                stand-alone self-test module, and your choice
                of PLCC or PQFP target adapters. Optional
                2 Mbyte relocatable expansion memory option
                provides overlayable memory for software
                prototyping and hardware debugging.


ARCHITECTURE EVALUATION 
STARTER KITS 

960SKit3        Contains ASM960D Assembler and iC960D Compiler.

DB960KIT2       Kit contains DB-960KBDEVA (KB version of DB-960
                used with EVA-960), EVA960KB4MB Software
                Execution Vehicle, ASM960D and C960E. Requires
                PC AT with 640K memory.

DB960KIT3       Kit contains DB-960D (serial version of DB-960
                supporting the 80960SA/SB, 80960KA/KB and
                80960CA components, operating on PC-AT/DOS),
                ASM960D and C960D. Requires PC AT with 640K
                memory.


Product Code to Order, by Host

Product        PC-AT/DOS     UNIX-386      OS/2       Sun 3/     HP9000/    VAX/       µVAX/
Category                     V.4                      UNIX       HP-UX      ULTRIX     ULTRIX

Assembler      ASM960D       ASM960S       ASM960P    ASM960U    ASM960H    ASM960VX   ASM960MX
C Compiler     C960D         C960S         C960P      C960U      C960H      C960VX     C960VX
System Gen     GEN960D       -             -          GEN960U    GEN960H    -          -
SX Debugger    DB960D        -             -          -          -          -          -
KX Debugger    DB960KBDEVA   -             -          -          -          -          -
               DB960D        -             -          -          -          -          -
CA Debugger    DB960CADIC    -             -          -          -          -          -
               DB960D        -             -          -          -          -          -
SA Simulator   -             DBSIM960KAS   -          -          -          -          -
CA Simulator   SIM960CAD     -             -          SIM960CAU  SIM960CAH  -          -
ICE960SB       ICE960SB      -             -          -          -          -          -
ICE960KB       ICE960KB      -             -          -          -          -          -


ICE™-960SB AND ICE-960KB
IN-CIRCUIT EMULATOR


INTERCHANGEABLE PROBES 

The ICE™-960 in-circuit emulator delivers real-time hardware and software debugging
capabilities for i960™ SA/SB and i960 KA/KB-based designs. Features include full-speed
emulation of each of the microprocessors, powerful breakpoint specification,
fastbreaks, optional relocatable expansion memory, two types of trace capability, large
trace buffering, sophisticated human interface and high-speed communication links with
the DOS host. The ICE-960 in-circuit emulator gives you unmatched control over all
phases of hardware/software debug, including developing, integrating and testing, which
improves development productivity and speeds time to market.

FEATURES 

• Real-Time Emulation of the i960 KA/KB 
microprocessors up to 25 MHz and 
emulation of the i960 SA/SB to 16 MHz 

• Full symbolic integration with Intel 
ASM and C compilers 

• Optional ICE960KBREM/ICE960SBREM
boards provide 2 Mbytes
of ICE memory which can overlay user
ROM or RAM.

• Examine and modify memory and the 
i960 registers 


October 1991 

5-15 Order Number: 280862-003 


• Dynamically monitor and update 
program variables via fastbreaks 

• Breakpoint capabilities include:
execution address, instruction type, bus
read/write/access, and data value.
Qualification of events is based on an
occurrence counter and an 8-state
state machine


• Hosted on IBM PC AT* or compatible and 
supporting RS232, RS422 and Ethernet 
operation 

• 1024 frame trace buffer for execution and/or 
bus trace and time tags 

• The on-chip cache does not affect collection of
the execution trace

• 256 Kbytes of memory in standalone self-test 
(SAST) unit 

• Real-time bus trace with time-tags for 
tracking code execution time 

• Assembly and disassembly of code in i960 
instruction mnemonics 

• ICE to component interconnect includes
support for surface-mounted and socketed
84-pin PLCC and surface-mounted 80-pin
EIAJ QFP i960 SA/SB and 132-pin PGA for
i960 KA/KB

The ICE-960 in-circuit emulator provides 
emulation of the i960 SA/SB at speeds to 
16 MHz and the i960 KA/KB at speeds to 
25 MHz, thus providing early detection of 
subtle timing problems that may arise at full 
speed. Intel’s intimate knowledge of the 
component makes possible the tightest 
conceivable conformance between timing 
parameters of the emulator and the target 
microprocessor. 

PROCESSOR/MEMORY
EXAMINATION AND
MODIFICATION

The i960 registers can be accessed
mnemonically (e.g. g12, r5, fp3) with the ICE-960
emulator software. Data can be displayed
or modified in hexadecimal, decimal, octal, or 
binary and by data type (byte, word, etc). 
Program memory contents can be modified as 
i960 assembly instruction mnemonics. 
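
As an illustration of the four display bases listed above, the following minimal sketch renders one 32-bit value in hexadecimal, decimal, octal, and binary. It is not the emulator's own display code; the function name `show_in_bases` and the 32-bit default width are assumptions made for the example.

```python
def show_in_bases(value: int, width_bits: int = 32) -> dict:
    """Render a value in the four bases named in the text.

    The value is masked to width_bits so that negative inputs are shown
    as their unsigned two's-complement bit pattern (an assumption here).
    """
    mask = (1 << width_bits) - 1
    v = value & mask
    return {
        "hex": format(v, "x"),
        "decimal": format(v, "d"),
        "octal": format(v, "o"),
        "binary": format(v, "b"),
    }


if __name__ == "__main__":
    for base, text in show_in_bases(255).items():
        print(f"{base:8s} {text}")
```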

PROGRAM TRACING 

The ICE-960 emulator can store 1024 frames of
program execution history or processor address/data
bus activity in the trace buffer. Each
frame of program execution contains a 
discontinuity address (branch, call, return, etc) 
and a time-tag. This information can be used to 
reconstruct a history of the program execution. 
With the execution trace option enabled, the 
ICE-960 will run at less than full speed. Each 
trace frame of bus cycles contains one complete 
bus burst trace. Collection of trace information 
is controlled by a logic analyzer type moving 
trace window and by bus access type. 
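
The execution trace described above can be pictured as a fixed-depth buffer of (discontinuity address, time-tag) frames. The sketch below is only a model under stated assumptions: the class and field names are hypothetical, the time-tag is treated as a plain tick count because its units are not given here, and the buffer is assumed to discard the oldest frame once 1024 frames are held.

```python
from collections import deque
from typing import NamedTuple

TRACE_DEPTH = 1024  # frame count stated in the text


class ExecFrame(NamedTuple):
    discontinuity_addr: int  # target of a branch, call, return, etc.
    time_tag: int            # hypothetical tick count; units are an assumption


class TraceBuffer:
    """Model of a fixed-depth execution trace (oldest frame dropped when full)."""

    def __init__(self, depth: int = TRACE_DEPTH):
        self._frames = deque(maxlen=depth)

    def record(self, addr: int, time_tag: int) -> None:
        self._frames.append(ExecFrame(addr, time_tag))

    def history(self) -> list:
        """Frames in the order they were recorded, oldest first."""
        return list(self._frames)

    def elapsed(self) -> int:
        """Ticks between the oldest and newest retained frames."""
        if len(self._frames) < 2:
            return 0
        return self._frames[-1].time_tag - self._frames[0].time_tag
```

Reading `history()` from oldest to newest gives the sequence of control-flow discontinuities, which is the information needed to reconstruct the path the program took.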


EVENT RECOGNITION
(BREAKPOINT CONTROL) AND
EMULATION CONTROL

ICE-960 provides comprehensive event 
recognition capabilities including: two 
hardware and thirty-two software breakpoints 
for instruction execution breakpoints, and use 
of the internal debug registers to recognize 
execution of certain instruction types such as 
branch or call instructions. Bus analysis logic 
provides recognition of external bus addresses 
qualified by read, write, or access type as well 
as data values. The data values may be entered 
as masked values and qualified by type. Two 
synchronization lines are provided for 
recognition of external events. ICE-960 also 
provides qualification of events based on an 
occurrence counter or by a recognition 
sequence of up to 8 events. Additionally, 
emulation can be automatically stopped when 
the trace buffer is full. Besides the ability to 
execute program code at full speed between 
specified points, the ICE-960 emulator provides 
the capability to single-step through program 
code. 
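
The two qualification mechanisms named above, an occurrence counter and a recognition sequence of up to 8 events, can be modeled in a few lines. This is a behavioral sketch only, not the emulator's logic: the class names are hypothetical, events are represented as plain strings, and the sequence model assumes that an out-of-order event is simply ignored rather than resetting the sequence.

```python
MAX_SEQUENCE = 8  # the text allows a recognition sequence of up to 8 events


class OccurrenceQualifier:
    """Signals a break once a matching event has been seen `count` times."""

    def __init__(self, count: int):
        if count < 1:
            raise ValueError("count must be at least 1")
        self.remaining = count

    def observe(self, matched: bool) -> bool:
        if matched and self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


class SequenceQualifier:
    """Signals a break once the listed events have been seen in order."""

    def __init__(self, events):
        if not 1 <= len(events) <= MAX_SEQUENCE:
            raise ValueError("sequence must hold 1 to 8 events")
        self.events = list(events)
        self.state = 0  # index of the next event being waited for

    def observe(self, event) -> bool:
        if self.state < len(self.events) and event == self.events[self.state]:
            self.state += 1
        return self.state == len(self.events)
```

For example, a sequence of a write, then a call, then a read only signals a break after all three have occurred in that order, whatever other events are interleaved.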

RELOCATABLE EXPANSION 
MEMORY 

An optional board provides ICE-960 with 2 
Mbytes of relocatable expansion memory 
which allows users to develop applications 
either before the target system memory is 
working, or in place of ROM or EPROM to 
speed the debugging cycle. This memory can be 
mapped in two separate 1 Mbyte partitions on 
1 Mbyte boundaries. 
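
The mapping rule above (two 1 Mbyte partitions, each on a 1 Mbyte boundary) amounts to a simple alignment check. The sketch below is illustrative only; `validate_map` is a hypothetical helper and is not the syntax of the ICE MAP command.

```python
MBYTE = 1 << 20   # 1 Mbyte partition size from the text
PARTITIONS = 2    # two separate partitions are available


def validate_map(bases):
    """Return the (start, end) address range of each requested partition.

    Raises ValueError if more than two partitions are requested, if a
    base address is not on a 1 Mbyte boundary, or if a base is repeated.
    """
    if len(bases) > PARTITIONS:
        raise ValueError("only two partitions are available")
    if len(set(bases)) != len(bases):
        raise ValueError("partitions overlap")
    ranges = []
    for base in bases:
        if base % MBYTE != 0:
            raise ValueError(f"{base:#x} is not on a 1 Mbyte boundary")
        ranges.append((base, base + MBYTE - 1))
    return ranges
```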

For the new ICE960KBREM board, the
memory waitstate pattern is (3,1,1,1) when the
user's system does not return RDY# for
accesses in the mapped area. For accesses
where the user's system does return RDY# for
these areas, the waitstate pattern will be the
larger of (3,1,1,1) or the user waitstate pattern plus
(2,2,2,2). For either board, the size and shape of
the board is identical to the ICE probe, and the board is
installed between the probe and the user’s
target system when in use. The memory
configuration can be mapped via an ICE MAP
command.
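
The waitstate rule above can be worked through numerically. The sketch below assumes that "the larger of" is applied position by position across the four-access burst pattern; the text does not say so explicitly, and the function name `rem_waitstates` is hypothetical.

```python
REM_MIN = (3, 1, 1, 1)    # pattern when the user system does not return RDY#
REM_ADDED = (2, 2, 2, 2)  # added to the user pattern when RDY# is returned


def rem_waitstates(user_pattern=None):
    """Effective waitstate pattern for accesses in the mapped area.

    user_pattern is None when the user system does not return RDY#.
    Otherwise each position is max(minimum, user + added), which is an
    element-wise reading of the rule (an assumption).
    """
    if user_pattern is None:
        return REM_MIN
    if len(user_pattern) != 4:
        raise ValueError("pattern must have four entries")
    return tuple(
        max(minimum, user + added)
        for minimum, user, added in zip(REM_MIN, user_pattern, REM_ADDED)
    )
```

Under this reading, a user system that returns RDY# with zero waitstates still sees (3,2,2,2) in the mapped area.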

The ICE960KBREM/ICE960SBREM cards add
some constraints when used with the ICE in a
user's target system. First, users should qualify
bus drivers/buffers with DEN# in order to
eliminate potential bus conflict between the
REM board and their target memory while
using the ICE. Second, the 1 Mbyte partition
size cannot be reduced and may affect the
design of the user's memory subsystem. Third,
the REM boards delay the ADS# and DEN#
signals by 5 ns (typical) and delay the RDY#
signal by 4 ns (typical). Fourth, they add loading,
capacitance, and power requirements as shown
in Tables 3 and 4.

STANDALONE OPERATION 

Product software can be developed and 
debugged prior to and independent of 
hardware availability with the Standalone Self 
Test unit (SAST), which contains 256 Kbytes of 
two wait-state program memory. The SAST 
also provides diagnostic testing to assure full 
functionality of the ICE-960 emulator. 

VERSATILE AND POWERFUL 
HOST SOFTWARE 

ICE-960 provides an easy-to-use human 
interface which utilizes color forms to 
complement a powerful command set. The 
software includes: an on-line help facility, a 
dynamic command entry and syntax guide, 
screen oriented editor, assembler and 
disassembler, input/ output redirection, 
command piping, DOS command entry, and 
the ability to customize the command set via 
debug procedures and literal definitions. 

DEBUG PROCEDURES AND 
LITERALS 

Debug procedures (PROCs) are user-defined 
groups of ICE960 emulator commands. They 
can be stored on disk and recalled during later 
debugging sessions. PROCs can be used to 
simplify the process of debugging by grouping 
repetitive emulator commands, which can then 
be accessed by typing the name of the PROC. 
Literals are user-defined abbreviations for 
whole or partial ICE-960 emulator commands. 
Literals are a shorthand method of 
customizing the emulator commands to fit 
your needs and preferences. 
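
The relationship between PROCs and literals can be sketched as a table of named command groups plus a table of abbreviations. Everything in this example is hypothetical: the command strings are placeholders and are not ICE-960 command syntax, and the token-by-token expansion rule is an assumption.

```python
def expand_literals(line: str, literals: dict) -> str:
    """Replace each whitespace-separated token that names a literal."""
    return " ".join(literals.get(token, token) for token in line.split())


class ProcTable:
    """Named groups of commands that can be recalled by name."""

    def __init__(self):
        self._procs = {}

    def define(self, name: str, commands) -> None:
        self._procs[name] = list(commands)

    def invoke(self, name: str, literals=None) -> list:
        """Return the PROC's commands with any literals expanded."""
        literals = literals or {}
        return [expand_literals(cmd, literals) for cmd in self._procs[name]]


if __name__ == "__main__":
    # Placeholder command names, chosen only for illustration.
    literals = {"rg": "SHOW_REGISTERS"}
    procs = ProcTable()
    procs.define("snap", ["rg", "SHOW_TRACE 16"])
    print(procs.invoke("snap", literals))
```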


ICE TO COMPONENT 
INTERCONNECT SYSTEM 

Using the On-Circuit Emulation (ONCE) i960
SA/SB silicon feature, ICE960SB can be used
in systems with surface-mounted i960 SA/SB
in PLCC or EIAJ QFP
packages. The hinge cable adapters included in
the various ICE kits are placed directly on top
of the surface-mounted i960 SA/SB device.
The circuitry
necessary for the emulator to take control
from the target processor is fully supported in
the emulator. No additional circuitry is
required.

Socketed i960 SA/SB components in PLCC
packages and i960 KA/KB components in PGA
packages are also supported. Please see
Figures 1, 2, 3, and 4 for
ICE Probe physical characteristics. Refer to
Table 5 for hinge cable loading and delay
characteristics.

WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 

Intel also offers a Software Support contract 
which includes technical software information, 
automatic distributions of software and 
documentation updates, iCOMMENTS 
publication, remote diagnostic software, and a 
development tools troubleshooting guide.


HIGH-SPEED HOST-TO-ICE
COMMUNICATIONS PROTOCOLS

ICE-960 supports RS232 and RS422
communications protocols to 115.2 KBaud and
1.152 MBaud respectively, depending upon the
ability of the host to support the specific rate.
Testing for these systems and the 
configurations involved are described in the 
following sections. 
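
To give a feel for the difference between the two link rates, the sketch below estimates the raw transfer time for a block of data. It assumes 10 bits on the wire per byte (one start bit, eight data bits, one stop bit) and ignores protocol overhead; neither figure comes from this data sheet, and `transfer_seconds` is a hypothetical helper.

```python
BITS_PER_BYTE = 10  # assumed framing: 1 start + 8 data + 1 stop bit


def transfer_seconds(n_bytes: int, baud: int) -> float:
    """Raw serial transfer time, ignoring any protocol overhead."""
    return n_bytes * BITS_PER_BYTE / baud


if __name__ == "__main__":
    block = 256 * 1024  # 256 Kbytes, the SAST memory size, used as an example
    for name, baud in (("RS232", 115_200), ("RS422", 1_152_000)):
        print(f"{name}: {transfer_seconds(block, baud):.2f} s")
```

Under these assumptions the RS422 link moves the same block in one tenth of the time.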


Intel’s 90-day Hardware Support package
includes technical hardware information,
warranty on parts, labor, material, and on-site
hardware support.

Intel Development Tools also offers a 30-day,
money-back guarantee to customers who are
not satisfied after purchasing any Intel
development tool.


SPECIFICATIONS


HOST REQUIREMENTS

IBM PC-AT (minimum requirements) with 640
Kbytes of conventional memory

1 MByte of RAM (Lotus, Intel, Microsoft
expanded memory specification)

20 MByte Fixed Disk

At least one 5¼" or 3½" Floppy Disk drive

RS232 or RS422 Communication Interface

DOS Operating System (version 3.2 or 3.3)

TESTED HOST
CONFIGURATIONS

IBM PC-AT with DOS 3.3. Tested with built-in
RS232 and a Quatech DS202 Asynchronous
RS422 Communications Board with 16550
Option

COMPAQ Deskpro 386* with DOS 3.3.
Tested with built-in RS232 and Quatech DS202
Asynchronous RS422 Communications Board
with 16550 Option

Systems Based on an Intel 301/302™ Box
with DOS 3.3. Tested with built-in RS232 to
115.2 KBaud and a Quatech DS202
Asynchronous RS422 Communications Board
with 16550 Option to 1.152 MBaud

IBM Personal System/2* with DOS 4.01.
Tested with built-in RS232

REQUIRED SYSTEM
RESOURCES

The ICE-960 emulator requires the following:
a) exclusive use of the i960 SA/SB or i960 KA/KB’s
on-chip debug registers and b) a
minimum of 256 bytes of target system RAM
used to flush the i960 local registers.

MECHANICAL SPECIFICATIONS


TABLE 1. ICE-960 Emulator Physical Characteristics

                        Width            Height           Length           Weight
Unit                 Inches    cm     Inches    cm     Inches    cm      lbs     kg

Control Unit          10.5    26.7     1.5      3.8     16.0    40.6     6.0    2.72
Processor Module*      3.8     9.6     1.5      3.8      5.0    12.7
SAST                   6.0    15.2     2.0      5.1      8.0                    1.59
OIB                    3.8     9.6     0.9               5.1    13.0
                       2.8             4.2     10.7     11.0    27.9     4.7    2.14
                                                        22.0    55.9
                                                        12.0'   3.66 m

*Measurement includes target adaptor

Figure 2: Optional Isolation Board

Figure 3: ICE960SB16C Adapter (PLCC hinge cable dimensions and required
clearance for surface mount components; all measurements in centimeters)

AC/DC Specifications

The Optional Isolation Board (OIB) isolates the
ICE-960 probe from an untested user target
system. When the OIB is in use, the ICE-960
AC and DC specifications differ from the i960
microprocessor as shown below. When the OIB
is not installed, the ICE-960KB timing
specifications are identical to those of the i960
component.


TABLE 2. AC Specifications with the OIB Installed

                                                80960SB (16 MHz)      80960KB (25 MHz)
Symbol*  Parameter                              Min        Max        Min        Max

T1       Clock Period                           32 ns      125 ns     20 ns      125 ns
T2       Clock Low Time                         9 ns                  6 ns
T3       Clock High Time                        9 ns                  6 ns
T4       Clock Fall Time                                   10 ns                 10 ns
T5       Clock Rise Time                                   10 ns                 10 ns
T6       Output Valid Delay
           A(2:3), BE#(0:1), BLAST#,*                      40 ns                 33 ns
           DEN#, DTR#, WR#**
           A/D Lines***                                    40 ns                 33 ns
T6 AS    AS Valid Delay (AS#)                              36 ns                 33 ns
T7       ALE# Width                             16 ns                 12 ns
T8       ALE# Valid Delay                                  36 ns                 33 ns
T9       Output Float Delay
           A(2:3), BE#(0:1), BLAST#,*                      50 ns                 35 ns
           DEN#, DTR#, WR#**
           A/D Lines                                       50 ns                 40 ns
T10      Input Setup 1
           HLDA, INT0#, INT1, INT2, INT3#       13 ns                 6 ns
T11      Input Hold
           HLDA, INT0#, INT1, INT2, INT3#       10 ns                 13 ns
           HOLD, READY#, LOCK#                  10 ns                 13 ns
T12      Input Setup 2
           HOLD, READY#, LOCK#                  17 ns                 11 ns
T13      Setup to ALE# Inactive                 7 ns                  7 ns
T14      Hold after ALE#                        5 ns                  5 ns
T15      RESET# Hold                            4 ns                  4 ns
T16      RESET# Setup                           4 ns
T17      RESET# Width                           1281 ns               820 ns

*TPLH dependent on termination for KB control signals
**OIB does not float A/D bus during Tr and Ti (between bus cycles)
***Output Valid Delay for control signals after HOLD ACKNOWLEDGE is deasserted is 50 ns for 80960SB and 43 ns for 80960KB


ELECTRICAL SPECIFICATIONS 

SYNC Line Specification

The SYNCIN line must be valid for at least one
instruction cycle because it is only sampled on
bus access boundaries. The SYNCIN line is a
standard TTL input. The SYNCOUT line is
driven by a TTL open collector with a 4.75 kΩ
pull-up resistor.


TABLE 3. ICE-960 Emulator DC Specifications

             ICE Probe    OIB     REM     Processor Speed (MHz)

ICE960SB     1.4          0.4     0.5     16
ICE960KB     1.4          0.6     0.7     25


TARGET SYSTEM DESIGN 
CONSIDERATIONS 

In addition to the mechanical, power 
consumption, and signal loading 
considerations for the ICE probe, the following 
points should be taken into account when the 
target system is being designed: 

1) [SA/SB/KA/KB/MC] 

The AD bus should not be driven by an
external source unless DEN# is asserted.

2) [SA/SB/KA/KB/MC] 

The LOCK# signal must be terminated as 
recommended in the 80960SA/SB 
component data sheet. 


3) [SA/SB/KA/KB/MC] 

To guarantee timings, the ICE requires 
± 5% supply voltage to the target system 
(i.e., ICE probe power). 

4) [SA/SB] 

To ensure correct bus trace, the ICE requires
a data hold time (T11) of 4 ns.

5) [SA/SB/KA/KB/MC] 

Each Vcc and GND pin of the processor 
must be connected to the appropriate 
voltage or ground and externally strapped 
close to the package. 

6) [SA/SB/KA/KB/MC] 

Processor no connect (N.C.) pins must be
left disconnected.


TABLE 4. Additional DC Loading

             (ICE Probe)          (OIB)                (KB REM)             (SB REM)
Signal       IIH Max   IIL Max    IIH Max   IIL Max    IIH Max   IIL Max    IIH Max   IIL Max

AD(0:31)     25 µA     25 µA      15 µA     -15 µA     120 µA    0.7 mA     20 µA     100 µA
ADS#         25 µA     25 µA      115 µA    -15 µA     Driven by 74AS760    10 µA     10 µA
                                                       w/ 4.7k Pull-Up
DEN#         25 µA     25 µA      115 µA    -15 µA
W/R#         25 µA     25 µA      115 µA    -15 µA     150 µA    1.7 mA     10 µA     10 µA
CLK2         50 µA     500 µA     25 µA     -25 µA     130 µA    2.9 mA     20 µA     1600 µA
RESET        25 µA     250 µA     45 µA     -750 µA    250 µA    0.3 mA     10 µA     10 µA
BE(0:3)#     25 µA     25 µA      115 µA    -15 µA     10 µA     0.1 mA     10 µA     10 µA
READY#       25 µA     25 µA      45 µA     -750 µA    750 µA    0.8 mA     25 µA     260 µA
ALE#         25 µA     25 µA      15 µA     -15 µA     20 µA     0.5 mA     10 µA     1600 µA
DT/R#        25 µA     25 µA      115 µA    -15 µA
INT(0:3)     25 µA     25 µA      15 µA     -565 µA
BADAC#       25 µA     25 µA      15 µA     -565 µA
LOCK#        25 µA     25 µA      140 µA    -500 µA
HOLD         25 µA     25 µA      45 µA     -750 µA
FAILURE#     25 µA     25 µA      20 µA     -1 mA
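
When budgeting drive current for a target signal, the per-unit figures in Table 4 can be added for whichever units are installed. The sketch below does this for the IIH Max column of three signals, using values copied from the table. Treating the combined load as a simple sum is an assumption, and the names `IIH_MAX_UA` and `total_iih_ua` are hypothetical.

```python
# IIH Max in microamps, copied from Table 4 for three example signals.
IIH_MAX_UA = {
    "AD(0:31)": {"probe": 25, "oib": 15, "kb_rem": 120, "sb_rem": 20},
    "CLK2":     {"probe": 50, "oib": 25, "kb_rem": 130, "sb_rem": 20},
    "READY#":   {"probe": 25, "oib": 45, "kb_rem": 750, "sb_rem": 25},
}


def total_iih_ua(signal: str, options=()) -> int:
    """Sum the IIH Max load of the ICE probe plus any installed options.

    options may contain "oib", "kb_rem" or "sb_rem". Simple addition of
    the per-unit maxima is an assumption, not a rule stated in the table.
    """
    loads = IIH_MAX_UA[signal]
    return loads["probe"] + sum(loads[name] for name in options)


if __name__ == "__main__":
    print(total_iih_ua("AD(0:31)", ["oib", "kb_rem"]), "uA")
```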






TABLE 5. 80960SB PLCC Hinge Cable Loading and Delay 


Signal Loading    15 pF typical

Signal Delay      Signals from processor delayed 4 ns typical; setup
                  and hold timings unaffected.


ORDERING INFORMATION

Order Code      Description

ADPT80EIAJ      Hinge Cable Adapter for surface-mount i960SB EIAJ
                QFP packages. This adapter is included in the
                ICE960SB16J kit.

ADPT84PLCC      Hinge Cable Adapter for surface-mount and socketed
                i960SB PLCC packages. This adapter is included in
                the ICE960SB16C kit.

ICE960SB16C     ICE960 base, i960 SA/SB probe, 84-pin PLCC
                surface-mount and socketed target component
                interconnect, and RS232 and RS422 communication
                cables. (Shrink-Wrap license, Class 1)

ICE960SB16J     ICE960 base, i960 SA/SB probe, 80-pin EIAJ
                surface-mount target component interconnect, and
                RS232 and RS422 communication cables.
                (Shrink-Wrap license, Class 1)

ICE960KB25      ICE960 base, i960 KA/KB probe, 132-pin PGA target
                component interconnect, and RS232 and RS422
                communication cables. (Shrink-Wrap license, Class 1)


Order Code      Description

ICE960SBREM     Optional 2 MByte Relocatable Expansion Memory
                Board for i960 SA/SB components.

ICE960KBREM     Optional 2 MByte Relocatable Expansion Memory
                Board for 80960KA/KB components.

PTOI960SB16     Probe and Software to convert ICE960KB25 to
                ICE960SB16. An ADPT80EIAJ or ADPT84PLCC adapter
                kit should also be ordered with this package to
                support the component packaging type of your
                choice. (Shrink-Wrap license, Class 1)

PTOI960KB25     Probe and Software to convert ICE960SB16C or
                ICE960SB16J to ICE960KB25.
                (Shrink-Wrap license, Class 1)


ICE™-960MC IN-CIRCUIT EMULATOR


IN-CIRCUIT EMULATOR FOR THE 80960MC
MICROPROCESSOR

The ICE™-960MC In-circuit Emulator delivers real-time hardware and software
debugging capabilities for 80960MC based designs. Features include emulation of the
80960MC microprocessor, powerful breakpoint specification, fastbreaks, optional
relocatable expansion memory, two types of trace capability, large trace buffering,
support of virtual and physical component addressing modes, and sophisticated human
interface. The ICE-960MC In-circuit Emulator gives you unmatched control over all
phases of hardware/software debug, including developing, integrating and testing, which
improves development productivity and speeds time to market.

FEATURES 


• Real-Time Emulation of the 80960MC
microprocessor up to 20 MHz (25 MHz
optional)

• Full Symbolic Information Relating to
Code. Data symbolics subject to some
limitations in virtual addressing mode

• Optional ICE960KBREM Board Provides
2 Mbytes of ICE Memory Which Can
Overlay User ROM or RAM.

• Zero wait-state operation from user
memory

• Examine and modify Memory and the
80960 Registers

• Breakpoint Capabilities include:
Execution Address, Instruction Type,
Bus Read/Write/Access, and Data
Value. Qualification of Events is Based
on an Occurrence Counter and an 8-state
State Machine

• Hosted on IBM PC AT or compatible

• Dynamically monitor or update program
variables or memory during emulation
with Fastbreaks

• 1024 Frame Trace Buffer for execution
and/or Bus Trace and time tags

• 256 Kbytes of Memory in Standalone
Self-Test (SAST) Unit


REAL-TIME EMULATION 

The ICE-960MC In-circuit Emulator provides 
emulation of the 80960MC at speeds up to 20 
MHz (25 MHz optional), thus providing early 
detection of subtle timing problems. Intel’s 
intimate knowledge of the component makes 
possible the tightest conceivable conformance 
between timing parameters of the emulator 
and the target microprocessor. 

PROCESSOR/MEMORY 
EXAMINATION AND 
MODIFICATION 

The 80960MC registers can be accessed 
mnemonically (e.g. g12, r5, fp3) with the ICE- 
960MC emulator software. Data can be 
displayed or modified in one of four bases 
(hexadecimal, decimal, octal, or binary) and by 
data type (byte, word, etc). Program memory 
contents can be disassembled and displayed as 
80960 assembly instruction mnemonics. 
Additionally, 80960 assembly instruction 
mnemonics can be assembled and stored into 
program memory. 80960MC system data 
structures such as the segment table, dispatch 
port, and page tables can also be accessed and 
modified mnemonically. 

PROGRAM TRACING 

The ICE-960MC emulator can store 1024 
frames of program execution history or 1024 
frames of the 80960MC address/data bus 
activity in the trace buffer. Each frame of 
program execution contains a discontinuity 
address (branch, call, return, etc) and a time- 
tag. This information can be used to 
reconstruct a history of the program execution. 
With the execution trace option enabled, the 
ICE-960MC will run at less than full speed. 
Each trace frame of bus cycles contains one 
complete bus burst trace. Collection of trace 
information is controlled by a logic analyzer 
type moving trace window and by bus access 
type. 

EVENT RECOGNITION 
(BREAKPOINT CONTROL) AND 
EMULATION CONTROL 

ICE-960MC provides comprehensive event 
recognition capabilities including: two 
hardware and thirty-two software breakpoints 
for instruction execution breakpoints, and use 
of the internal debug registers to recognize 
execution of certain instruction types such as 
branch or call instructions. Bus analysis logic 
provides recognition of external bus addresses 
qualified by read, write, or access type as well 
as data values which may be entered as 
masked values. Two synchronization lines are 
provided for recognition of external events. 
ICE-960MC also provides qualification of 
events based on an occurrence counter or by a 
recognition sequence of up to 8 events. Special 
additions for the 80960MC include the ability 
to recognize process binds. Additionally, 
emulation can be automatically stopped when 
the trace buffer is full. Besides the ability to 
execute program code at full speed between 
specified points, the ICE-960MC emulator 
provides the capability to single-step through 
program code. 
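
The two qualification modes described above, an occurrence counter and a recognition sequence of up to 8 events, can be modeled in a few lines. The Python sketch below is illustrative only: the class names, the event labels, and the choice to latch once the condition is met are assumptions, not the emulator's command syntax or its internal logic.

```python
class OccurrenceQualifier:
    """Fires on the Nth occurrence of a single event (hypothetical model)."""

    def __init__(self, event, count):
        self.event = event
        self.count = count
        self.seen = 0

    def observe(self, event):
        # Count only matching events; report True once the count is reached.
        if event == self.event:
            self.seen += 1
        return self.seen >= self.count


class SequenceQualifier:
    """Fires once a fixed list of events has been seen in order."""

    MAX_EVENTS = 8  # the data sheet allows a sequence of up to 8 events

    def __init__(self, sequence):
        if not 1 <= len(sequence) <= self.MAX_EVENTS:
            raise ValueError("sequence must contain 1 to 8 events")
        self.sequence = list(sequence)
        self.state = 0  # index of the next event being waited for

    def observe(self, event):
        # Advance one state per matching event; other events are ignored.
        if self.state < len(self.sequence) and event == self.sequence[self.state]:
            self.state += 1
        return self.state == len(self.sequence)
```

Feeding each recognized event to `observe()` returns `True` at the point where a break would be taken under this model.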

RELOCATABLE EXPANSION 
MEMORY 

An optional board provides ICE-960MC with 2 
Mbytes of relocatable expansion memory 
which allows users to develop applications 
either before the target system memory is 
working, or in place of ROM or EPROM to 
speed the debugging cycle. This memory can be 
mapped in two separate 1 Mbyte partitions on 
1 Mbyte boundaries. The memory waitstate 
pattern is (3,1,1,1) when the user’s system does 
not return RDY# for accesses directed to the 
ICE960KBREM board. For accesses where the 
user system does return RDY#, the waitstate 
pattern will be the larger of (3,1,1,1) or the user 
waitstate pattern plus (2,2,2,2). The size and 
shape of the board is identical to the ICE probe 
and is installed between the probe and the 
user’s target system when in use. The memory 
configuration can be mapped via either an ICE 
MAP command or via switches on the 
ICE960KBREM board. 
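
The waitstate rule above can be written out as a small calculation. This is a sketch under one stated assumption: "the larger of" is taken separately for each of the four accesses in a burst. The function and constant names are hypothetical.

```python
REM_BASE = (3, 1, 1, 1)     # pattern when the target does not return RDY#
REM_PENALTY = (2, 2, 2, 2)  # added to the user's pattern when it does


def rem_waitstates(user_pattern=None):
    """Return the effective four-access waitstate pattern.

    user_pattern is None when the target does not return RDY#.
    Assumption: the comparison is made access by access.
    """
    if user_pattern is None:
        return REM_BASE
    if len(user_pattern) != 4:
        raise ValueError("expected one waitstate count per burst access")
    padded = tuple(u + p for u, p in zip(user_pattern, REM_PENALTY))
    return tuple(max(b, x) for b, x in zip(REM_BASE, padded))
```

Under this reading, a zero-waitstate target that returns RDY# sees (3,2,2,2), and a (2,1,1,1) target sees (4,3,3,3).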

The ICE-960KBREM card adds some 
constraints when used with the ICE in a user’s 
target system. First, users should qualify bus 
drivers/buffers with DEN# in order to 
eliminate potential bus conflict between 
REM960 and their target memory. Second, the 
1 Mbyte partition size cannot be reduced and 
may affect the design of the user’s memory 
subsystem. Third, ICE960KBREM delays the 
ADS# and DEN# signals by 5 nsec (typical) 
and delays the RDY# signal by 2 nsec (typical). 
Fourth, it adds loading, capacitance, and power 
requirements as shown in tables 3 and 4. 




STANDALONE OPERATION 

Product software can be developed and 
debugged prior to and independent of 
hardware availability with the Standalone Self 
Test unit (SAST), which contains 256 Kbytes of 
two wait-state program memory. The SAST 
also provides diagnostic testing to assure full 
functionality of the ICE-960MC emulator. 

VERSATILE AND POWERFUL 
HOST SOFTWARE 

ICE-960MC provides an easy-to-use human 
interface which utilizes color and pull-down 
menus to complement a powerful command 
set. The software includes: an on-line help 
facility, a dynamic command entry and syntax 
guide, screen oriented editor, assembler and 
disassembler, input/ output redirection, 
command piping, DOS command entry, and 
the ability to customize the command set via 
debug procedures and literal definitions. 


Special software commands are provided to 
display, interpret, and modify the 80960MC 
hardware data structures including the 
segment table, dispatch port, process control 
block, and the page tables and directories. 

DEBUG PROCEDURES AND 
LITERALS 

Debug procedures (PROCs) are user-defined 
groups of ICE-960MC emulator commands. 
They can be stored on disk and recalled during 
later debugging sessions. PROCs can be used to 
simplify the process of debugging by grouping 
repetitive emulator commands, which can then 
be accessed by typing the name of the PROC. 
Literals are user-defined abbreviations for 
whole or partial ICE-960MC emulator 
commands. Literals are a shorthand method of 
customizing the emulator commands to fit 
your needs and preferences. 
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WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, and 
workshops, field application engineering 
expertise, hotline technical support, and on- 
site service. 

Intel also offers a Software Support package 
which includes technical software information, 
telephone support, automatic distribution of 
software and documentation updates, access to 
the “ToolTalk” electronic bulletin board, 
“iComments” publication, remote diagnostic 
software, and a development tools 
troubleshooting guide. 

Intel’s Hardware Support package includes 
technical hardware information, telephone 
support, warranty on parts, labor, material, 
and on-site hardware support. 


SPECIFICATIONS 


HOST REQUIREMENTS 

IBM PC AT (minimum requirements) with 640 
KB of conventional memory 

• 1 MB of RAM (Lotus, Intel, Microsoft 
expanded memory specification) 

• 20 MB Fixed Disk 

• At least one 5¼" Floppy Disk drive 

• A serial interface 

• DOS Operating system (version 3.2 or later 
excluding 4.x) 


REQUIRED SYSTEM 
RESOURCES 

The ICE-960MC emulator requires the 
following: a) exclusive use of the 80960MC’s on- 
chip debug registers and b) a minimum of 256 
bytes of target system RAM used to flush the 
80960 local registers. 


Mechanical Specifications 


TABLE 1. ICE-960MC Emulator Physical Characteristics 

                      Width           Height          Length            Weight 
Unit               Inches    cm    Inches    cm    Inches     cm      lbs     kg 

Control unit        10.5    26.7     1.5     3.8    16.0     40.6     6.0    2.72 
Processor module*    3.8     9.6     1.5     3.8     5.0     12.7 
SAST                 6.0    15.2     2.0     5.1     8.0     20.3     3.5    1.59 
OIB                  3.8     9.6      .9     2.3     5.1     13.0 
Power supply         2.8     7.1     4.2    10.7    11.0     27.9     4.7    2.14 
User cable                                          22.0     55.9 
Serial cable                                        12.0 ft   3.66 m 

*measurement includes target adaptor 
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1 1 

ELECTRICAL SPECIFICATIONS 

SYNC Line Specification 

The SYNCIN line must be valid for at least one 
instruction cycle because it is only sampled on 
instruction boundaries. The SYNCIN line is a 
standard TTL input. The SYNCOUT line is 
driven by a TTL open collector with a 4.75K- 
ohm pull-up resistor. 

AC/DC Specifications 

The Optional Isolation Board (OIB) isolates the 
ICE-960MC probe from an untested user target 
system. When the OIB is in use, the ICE- 
960MC AC and DC specifications differ from 
the 80960MC microprocessor as shown below. 
When the OIB is not installed, the ICE-960MC 
specifications are identical to those of the 
80960MC component. 

TABLE 2. AC Specifications With The OIB Installed 

Symbol*  Parameter                                  Minimum        Maximum 

t2       clock low time                             t2 + 1 ns 
t3       clock high time                            t3 + 1 ns 
t6       output valid delay 
           A/D 0:31                                 t6 + 8 ns      t6 + 16 ns 
           DT/R#, DEN#, BE0-3#, ADS#, W/R#          t6 + 7 ns      t6 + 14 ns 
           HLDA, CACHE, LOCK#, INTA#                t6 + 6 ns      t6 + 8 ns 
           ALE#                                     t6 + 10 ns     t6 + 20 ns 
t7       ALE# width                                 t7 - 6.5 ns 
t8       ALE# disable delay                         t8 + ns        t8 + 14 ns 
t9       output float delay 
           A/D 0:31                                 t9 + 5 ns      t9 + 22 ns 
           DT/R#, DEN#, BE0-3#, ADS#, W/R#          t9 + 7 ns      t9 + 15 ns 
           HLDA, CACHE, LOCK#, INTA#                t9 + 6 ns      t9 + 8 ns 
t10      input setup 1 
           A/D 0:31                                 t10 + 2 ns 
           BADAC#, INT0-3# deassertion              t10 + 14 ns 
t11      input hold 
           A/D 0:31, HOLD                           t11 + 6 ns 
           BADAC#, INT0-3#, READY#                  t11 + 7 ns 
t16      reset setup time                           t16 + 6 

* symbol refers to 80960MC specification 

TABLE 3. ICE-960MC Emulator DC Specifications 

Symbol*    Parameter                          Maximum 

PM-Icc     Supply current with 80960KB-20     1400 mA 
OIB-Icc    Supply current                     PM-Icc + 1100 mA 
REM-Icc    Supply current                     PM-Icc + 1300 mA (1700 Total Typical) 
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TABLE 4. Additional DC Loading 

                  (without             (with                (with 
                OIB installed)      OIB installed)       REM installed) 
Signal          Iih       Iil       Iih       Iil        Iih       Iil 

AD (0:31)       100 uA    0.6 mA    20 uA     -1 mA      120 uA    0.7 mA 
ADS#            140 uA    1.6 mA    20 uA     -1 mA      Driven by 74AS760 
DEN#            40 uA     1.0 mA    20 uA     -1 mA      w/ 4.7k pull-up 
W/R#            140 uA    1.6 mA    20 uA     -1 mA      150 uA    1.7 mA 
CLK2            80 uA     2.2 mA    50 uA     -2 mA      130 uA    2.9 mA 
RESET                               50 uA     -2 mA      250 uA    0.3 mA 
BE (0:3)#                           20 uA     -1 mA      10 uA     0.1 mA 
READY#                              20 uA     -1 mA      750 uA    0.8 mA 
ALE#                                20 uA     -1 mA      20 uA     0.5 mA 
DT/R#                               20 uA     -1 mA 
INT0#, INT3#                        20 uA     -1 mA 
INT1, INT2                          20 uA     -1 mA 
BADAC#                              20 uA     -1 mA 
LOCK#                               20 uA     -1 mA 
HOLD                                20 uA     -1 mA 
FAILURE#                            20 uA     -1 mA 
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SPECIFICATIONS 

POWER SUPPLY 

100-120V or 220-240V (Selectable) 

50-60 Hz 

2 amps (AC Max) @ 120V 
1 amp (AC Max) @ 240V 

ENVIRONMENTAL 

CHARACTERISTICS 

Operating Temperature    10°C to 40°C 
                         (50°F to 104°F) 

Operating Humidity       Maximum 85% 
                         Relative Humidity, 
                         non-condensing 


ORDERING INFORMATION 

Order Code      Description 

ICE960MC        The complete 20 MHz ICE-960MC emulator system 
                including control unit, processor module, power 
                supply, SAST, OIB, SAB, serial communications cable 
                (SCOM4), lEDIT, V1.0 software. (Requires software 
                license. Class I) 

ICE960MC25P     25 MHz ICE960MC as described above 

I960MCUPG       Conversion kit to convert ICE-960KB to ICE-960MC. 
                Consists of new host and probe software, probe 
                firmware, and manual. Requires ICE-960KB V2.0 
                hardware. 

ICE960KBREM     Optional 2 Mbyte Relocatable Expansion Memory Board. 
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QT960 EVALUATION AND PROTOTYPING 

BOARD 





LOW COST EVALUATION TOOL 

The QT960 products give you a 32-bit starter kit to begin software evaluation and 
hardware design at a low cost. The boards feature the 20 MHz 80960KB 32-bit embedded 
processor. The 80960KB has integrated floating point, instruction and register caches, 
and an on-chip interrupt controller. The 80960K-series processors are the first in a new 
architectural family of embedded processors from Intel built using Intel’s CHMOS IV† 
process. These boards provide you with full access to the features of the 80960KB 
processor. A wire wrap prototyping area offers you easy access to board features to test 
your designs. Interleaved EPROM means fast execution of your code taking advantage of 
the 80960KB’s burst bus. A programmable wait state generator simulates different 
memory environments useful in evaluating the performance of your code. These features 
make the QT960 boards useful low cost tools for the 32-bit embedded designer. 

Once your program is written, you can debug it with NINDY, an EPROM resident debug 
monitor. NINDY enables you to download code, set seven different trace modes, display 
and modify memory or registers, and disassemble problem code sequences. 

Available separately from Intel are the ASM-960 (assembly language) and iC-960 (high- 
level language) products which provide you with the code development environment for 
the QT960 boards. 

The starter kit comes in two versions: the QT960F version has fast SRAM, high speed 
EPROM and Flash memory; the QT960E version has lower cost SRAM, Flash memory 
and no high speed EPROM. Each version has NINDY in either EPROM (QT960F) or 
Flash memory (QT960E), power supply cable, and the QT960 User Manual. Both versions 
also include the parts list, source code of the debug monitor, and the board data base 
(schematics) all on diskette. Armed with this starter kit you now have a system to 
evaluate and prototype your product ideas quickly and at low cost. 
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FEATURES 


QT960 FEATURES 

• 20 MHz Execution Speed 

• 128K Bytes of Zero Wait State EPROM‡ 

• 128K Bytes of Flash Memory 

• 128K Bytes of Zero Wait State SRAM‡ 

• Programmable Wait State Generator 

• Prototyping Wire Wrap Area 

• Five Instruction Traces 

• Two Hardware Breakpoints 


• Display/Modify Memory and Registers 

• Code Disassembly 

• High Level Language Support 

• RS-232 Communications Link 

• The QT960E Version has 128K Bytes of Two 
Wait State SRAM and 128K Bytes Four Wait 
State Flash Memory 


Product Order Codes: EVQT960F20 and EVQT960E20 

†CHMOS IV is a patented Intel process. 

‡QT960F Version only. 


FAST AND EASY CODE UPDATES 

128K Bytes of Intel’s 28F256 Flash memory provides an easy and quick method of changing your 
code in nonvolatile memory. Flash memory may be conveniently reprogrammed without 
removing it from the board while software is under development. 

FAST EPROM 

Interleaved fast EPROM (Intel’s 27C202) on the QT960F version yields one-zero-zero-zero wait 
state code access. It efficiently utilizes the four word burst capabilities of the 80960KB bus 
maximizing program performance. 

PROTOTYPING SUPPORT 

A prototyping wire wrap area is provided on board with access to the system’s signals and buses. 
This area gives you access to the board’s features and allows you to easily test design ideas. A 
system bus connector is also provided for off board prototyping. 

PROGRAMMABLE WAIT STATE GENERATOR 

A software programmable wait state generator enables you to quickly model various memory 
speeds. Under software control you can set over 16 different wait state combinations and evaluate 
the performance of your target system. 

DMA 

The board offers you eight DMA channels accessed through a NINDY library function using 
Intel’s 82380. In addition, off board connectors provide DMA I/O capabilities. 


FIVE INSTRUCTION TRACES AND TWO HARDWARE 
BREAKPOINTS 

NINDY utilizes the built-in trace capabilities of the 80960KB to provide you with single step, 
supervisor, call, return, and branch instruction tracing offering you extensive debug capabilities 
for software examination and modification. Two hardware breakpoints enable you to break on 
and examine EPROM resident code. 
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FEATURES 


HIGH LEVEL LANGUAGE SUPPORT 

NINDY is capable of downloading absolute object code generated by ASM-960 or iC-960. ASM-960 
and iC-960 may be purchased separately from Intel. 

COMMUNICATION AND SOFTWARE REQUIREMENTS 

The QT960 boards communicate with the host through the RS-232 link using an Intel 82510 
UART provided on board. The boards support five baud rates: 1200, 2400, 9600, 19200, and 38400. 
The default is 9600 baud. To communicate with the QT960 boards you must meet the following 
minimum software requirements: 

• Terminal Emulator 

• XMODEM Download Capabilities 
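
For readers writing their own host-side download utility, the sketch below builds one packet in the classic XMODEM checksum format: SOH, block number, its one's complement, 128 data bytes, and an 8-bit arithmetic checksum. Whether NINDY expects the checksum or the CRC variant is not stated here, so treat that choice, and the helper name `xmodem_packet`, as assumptions.

```python
SOH = 0x01        # start-of-header byte for 128-byte packets
PAD = 0x1A        # conventional padding byte for a short final block
BLOCK_SIZE = 128  # data bytes per packet


def xmodem_packet(block_number, data):
    """Build one XMODEM (checksum variant) packet as bytes."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("at most 128 data bytes per packet")
    payload = data.ljust(BLOCK_SIZE, bytes([PAD]))
    seq = block_number & 0xFF       # block numbers wrap at 256
    checksum = sum(payload) & 0xFF  # low 8 bits of the byte sum
    return bytes([SOH, seq, 0xFF - seq]) + payload + bytes([checksum])
```

A sender would transmit each packet over the RS-232 link and wait for an ACK before sending the next block; that handshake is omitted here.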



[Figure 270743-2: Block Diagram of the QT960 Board, including the wire wrap pins and the off board connector] 

For information or the number of your nearest sales office call 800-548-4752 (U.S. and Canada). 

Intel Corporation, Literature Department, 3065 Bowers Avenue, Santa Clara CA 95051, United States. Tel: 408-987-8080. 
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DB960CADIC IN-CIRCUIT DEBUG MONITOR 




DB960CADIC 

Intel’s DB960CADIC, the in-circuit debug monitor for the 33 MHz i960CA embedded 
microprocessor, represents a new generation of development tool technology. 

DB960CADIC allows users to debug high-speed, cached applications at the full speed of 
the i960CA target processor. Controlled by Intel’s DB interface, DB960CADIC offers the 
user a tool with a powerful feature set at a fraction of the cost of traditional development 
tools. DB960CADIC is designed to improve productivity by allowing the user to debug 
software before and after the target system arrives, with minimal hardware intrusion. 


Features 

• Real-time emulation of the i960CA 
embedded microprocessor at speeds up to 
33 MHz 

• Full development and debug support for 
i960CA on-chip cache and RAM 

• Minimally intrusive operation, allowing 
the user to debug the target system with 
minimal modification subject to initial 
design constraints 

• Breakpoint capabilities include ten 
software breakpoints, two hardware 
execution address breakpoints, and two 
hardware data address breakpoints. The 
human interface supplements these 
breakpoints with the ability to break on 
data values, conditions, and a four-state 
state machine in non-real time. 


• Low-Cost 

• Source-Level, Symbolic Debugging in a 
Windowed Human Interface with pull 
down Menus (DB). This interface is 
consistent across i960CA tools. 

• 128K Bytes User Memory 

• Virtual I/O, the ability to perform I/O 
between the DB960CADIC unit and the 
host 

• In-Circuit operation facilitates easy 
transition between target systems 

• Optional Stand-Alone Self-Test 
(DB960CASAST) Module 

• Optional Logic Analyzer Interface Board 
(LAI960CA) 
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DB960CADIC IN-CIRCUIT DEBUG MONITOR 


Full-Speed Debug and 
Development 

The DB960CADIC In-Circuit Debug Monitor 
provides sophisticated real-time hardware and 
software debug capabilities for i960CA 
embedded microprocessor-based designs. The 
user can run at the full speed of the target 
processor, ensuring that elusive timing bugs 
will be found. The DB960CADIC is jumpered to 
receive a clock pulse from either the user’s 
target system, or from an internal 25 MHz 
clock. 

Ideal for All Stages of 
Development 

DB960CADIC can be used by both hardware 
and software developers, at any stage of design. 
Early in the development process, 
DB960CADIC allows software debugging when 
inserted into an existing i960CA board such as 
the DB960CASAST module or the EV80960CA 
board. Later in the design cycle, DB960CADIC 
can be inserted into the user’s target system, 
thus facilitating debug of hardware/software 
integration. 

Speed Development with Source 
Code, Symbolic Debugging 

Using source code oriented debugging in a 
windowed, symbolic interface, software 
engineers can increase productivity by 
debugging in the medium they are familiar 
with, software source. 

Commands can be entered via either function 
keys, pull-down menus which group logically 
related commands, or a supplementary 
command line which allows entry of complex 
conditions. In addition, source code symbolics 
can be used to examine and modify memory 
and registers. Optimal symbolic debugging can 
be achieved when using DB960CADIC with 
genuine Intel languages. 

Powerful Break Capabilities 

DB960CADIC provides complex emulation 
control by utilizing the on-chip debug registers 
within the i960CA. Real-time break 
capabilities include the ability to break on any 
two execution addresses or data access 


addresses in hardware. Software breakpoints 
are also used to supplement the hardware 
breakpoints for RAM-based memory 
subsystems. DB960CADIC extends these 
capabilities by providing the ability to break 
on data values, NOT data values, or 
combinations of the above in a four-state state 
machine. More complex conditions such as 
breaking when a variable is less than a certain 
value can be entered via a very flexible feature 
called conditional breakpoints. 

128K Bytes User Memory 

DB960CADIC provides the user with 128K 
bytes of memory in Region F of the i960CA 
target space. Since the debug monitor is also 
placed in Region F, the on-chip bus interface 
unit of the i960CA is configured to address 
Region F as byte-wide memory with 5 
waitstates and no burst accesses allowed. 
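
Because Region F is reserved for the monitor and its 128K bytes of user memory, it can help to check linker maps or address-decode logic against it. The sketch below assumes that Region F is the address range 0F0000000 through 0FFFFFFFF given under the interface considerations, so that the upper four address bits select the region; the function names are hypothetical.

```python
REGION_SHIFT = 28  # assumption: the upper four address bits select the region


def region_of(address):
    """Return the region number (0x0 to 0xF) of a 32-bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    return address >> REGION_SHIFT


def in_region_f(address):
    """True when the address falls in Region F (0xF0000000-0xFFFFFFFF)."""
    return region_of(address) == 0xF
```

Target decode logic that must stay silent for DB960CADIC accesses would exclude every address for which `in_region_f()` is true.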

Virtual Input/Output 

DB960CADIC is shipped with documented 
library calls which provide users with a built- 
in mechanism of performing target I/O using 
the host system. These libraries provide the 
ability to simulate I/O operations in the target 
system before target hardware is available. 

High Speed Serial Link 

Communication between a host and the 
DB960CADIC module is supported via RS232 
and RS422 communication links. RS232 allows 
access to industry standard serial protocols 
while the RS422 link provides a higher speed 
communication mechanism currently 
emerging in the development market. PC/AT 
Compatible RS422 communication boards are 
available from various third party vendors. 

Optional Stand-Alone Self Test 
Chassis 

An optional stand-alone self test chassis 
complements DB960CADIC by allowing the 
user to debug and test code before prototype 
hardware is available. The DB960CASAST 
includes self-test circuitry to ensure that the 
DB960CADIC unit is working correctly. It also 
provides 4 Megabytes of DRAM to be used for 
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developing applications. This memory has a 
(3,1,1,1) waitstate pattern at 25 MHz. This 
waitstate pattern is programmable using the 
bus controller unit in the i960CA. It also 
includes an 8254 programmable timer which 
can optionally interrupt the i960CA processor 
and provide the ability to time code sequences. 

Optional Logic Analyzer Interface 
Board 

The LAI960CA board provides access to 
i960CA pins by routing the signals to easily 
accessible stake pins while passing them 
through to the target system. 

Software Completes the System 

Intel provides a comprehensive software 
development environment to complement 
DB960CADIC. This environment includes C 
and ASM source languages, a retargetable 
debug monitor, and DB960CADIC. The 
languages support the entire range of 80960 
embedded processors. 


DB960CADIC SPECIFICATIONS AND REQUIREMENTS 


Host System Requirements 

Host system requirements to run the in-circuit 
debugger include the following: 

• DOS version 3.2 or later excluding DOS 4.0 
• 640K bytes of RAM in conventional memory 
• A 20 MB hard disk 
• An RS232 or RS422 Serial Port 

Evaluated Systems include: 

IBM PC-AT* with DOS 3.3 
COMPAQ 386* with DOS 3.3 
Intel 301/302* with DOS 3.3 
IBM Personal System/2* Model 70/80 with 
DOS 4.01 

Environment Characteristics 

Operating Temperature: +10°C to +40°C 
(50°F to 104°F) 

Operating Humidity: Maximum of 90% 
relative humidity, 
non-condensing. 


Worldwide Service, Support, and 
Training 

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 

Intel also offers a Software Support contract 
which includes technical software information, 
telephone support, automatic distributions of 
software and documentation updates, 
iCOMMENTS publication, remote diagnostic 
software, and a development tools 
troubleshooting guide. 

Intel’s 90-day Hardware Support package 
includes technical hardware information, 
telephone support, warranty on parts, labor, 
material, and on-site hardware support. 

Intel Development Tools also offers a 30-day, 
money-back guarantee to customers who are 
not satisfied after purchasing any Intel 
development tool. 
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DB960CADIC IN-CIRCUIT DEBUG MONITOR 


DB960CADIC Interface 
Considerations 

Target systems intended to receive 
DB960CADIC must meet the following 
requirements: 

• The target system must not respond to 
memory accesses in Region F 
(0F0000000-0FFFFFFFF) with DB960CADIC installed. 
DB960CADIC provides an ACTIVE out 
signal which can be used to qualify bus logic 
to prevent this occurrence when 
DB960CADIC is installed. 

• The target system must provide 1.3 Amps 
(worst case), 0.9 Amps average, to power 
the DB960CADIC unit. 

• Use of one of the nine directly accessible 
i960CA interrupts. 

• Use of interrupt table entry 242 or 248. 

• Additional Signal Loading as follows: 

The DB960CADIC makes use of the PCLK 
outputs, D0 through D7, and some of the 
address and control signals of the processor. 
The following table lists the worst case 
loadings added by the presence of the 
DB960CADIC circuitry. 


Signal Name     DC Load (µA)     Capacitive Load (pF) 

PCLK1           +25/-250          8 
PCLK2           +30/-255         17 
CLKIN           +12/-12          13 
D0:D7           +20/-600         10 
A31:26          +25/-250         11 
A2:A17          +20/-100         10 
BE0*, BE1*      +20/-100         10 
ADS*            +50/-500         13 
W/R*            +50/-500         13 
WAIT*           +25/-250          8 
BLAST*          +25/-250          8 
FAIL*           +20/-20           8 
RESET*          +15/-15          25 
INT0:7*         +20/-500         15 
NMI*            +20/-500         15 

Additional Loading Imposed on the Target by 
the DB960CADIC 


Ordering Information 

DB960CADIC      In-circuit debug monitor for the i960CA embedded 
                microprocessor. Operates at speeds up to 33 MHz. 
                Includes hardware debug module, RS232/RS422 serial 
                cables, DOS host software, and documentation. 

DB960CASAST     Stand-Alone Self Test Unit for DB960CADIC. Includes 
                built-in power supply, self-test board, 4 Mbyte of 
                usable DRAM for code development, and enclosure. 

DB960CAST       DB960CADIC and DB960CASAST as described above. 

LAI960CA        Optional Logic Analyzer Interface Board for the 
                i960CA system. Does not require DB960CADIC. 
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Intel is committed to providing high quality products and customer support. Our 
commitment to quality is demonstrated by a 30 day, money-back, unconditional refund to 
customers not satisfied with their purchase of an Intel Development Tools product. 

Intel supports its customers by offering a 90-day software warranty and standard 
software support including free technical support over the phone. 

Intel software is continuously undergoing improvement. For customers who desire the 
security of having the most current software and the convenience of having updates sent 
automatically, Intel offers inexpensive Software Support Contracts. 


SOFTWARE WARRANTY 

The standard software warranty is 90 days 
and entitles the customer to the following 
(provided the customer has registered their 
software by returning a completed 
Warranty Registration Card): 

• Replacement of defective media 

• Software product updates occurring 
within the 90 day warranty period. 


STANDARD SOFTWARE 
SUPPORT 

Standard Software Support, provided at no 
additional cost, offers the following 
additional benefits: 

• Free Technical Information Phone 
Service (“TIPS”) 

• Timely response to Software Problem 
Reports 
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INTEL DEVELOPMENT TOOLS SOFTWARE SERVICES 


Software Support Contracts 

Software Support Contracts cover products for 
one year from the date of purchase and are 
renewable annually. The following benefits are 
provided: 

• Automatic Software Updates 

• Standard Software Support 


• Remote Diagnostic Software for DOS-based 
products. 

• Monthly issues of iCOMMENTS, a technical 
support publication 

• Quarterly issues of Troubleshooting Guides 
(host-specific) 

• Quantity discounts 


ORDERING INFORMATION 


Ordering Procedures 

For more information, call 1-800-468-3548 or 
your local Intel sales office. Similar support 
offerings are available outside of North 
America. Software Support Contracts are 
available for North American customers only. 

All orders for contracts, including renewals, 
can be submitted through the local Intel sales 
office or directly to the Development Tools 
Operation by calling 1-800-874-6835. 

To order a Software Support Contract, a 
customer must have registered their product 
or provide proof of ownership. Customers must 
also have the most current version of the 
software, otherwise, they must order a product 
upgrade before a support contract may be 
purchased. 

Pricing is a percentage of the List Price, based 
on the number of copies covered by the 
Software Support Contract. For emulators, the 
percentage will be applied to the identified list 
price of the software portion only, not the full 
list price of the emulator. 


Pricing Information 

Quantity discounts are: 

product quantity     pricing per copy 

1-10 copies          20% of List Price 
11-25 copies         15% of List Price 
26+ copies           10% of List Price 

VAX and MicroVAX software not included. 

Please call 1-800-874-6835 for price quote. 
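
As a worked example of the quantity schedule above, the sketch below estimates a contract price. It assumes that the tier reached by the total number of copies sets the per-copy rate for every copy, and that `list_price` is the list price of the covered software; both the function name and that reading are assumptions, so a quote from Intel takes precedence.

```python
def support_contract_price(list_price, copies):
    """Estimate the contract price for a number of copies of one product."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    if copies <= 10:
        rate_percent = 20
    elif copies <= 25:
        rate_percent = 15
    else:
        rate_percent = 10
    # Per-copy price is rate_percent of list price, applied to every copy.
    return copies * list_price * rate_percent / 100
```

For software with a list price of 1000, 5 copies come to 1000, 12 copies to 1800, and 30 copies to 3000 under this reading.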

Ordering Information 

order code       description 

SWSUPPORT51      Software Support Contract for 51 family 
SWSUPPORT86      Software Support Contract for 86 family 
SWSUPPORT96      Software Support Contract for 96 family 
SWSUPPORT286     Software Support Contract for 286 family 
SWSUPPORT386     Software Support Contract for 386 family 
SWSUPPORT486     Software Support Contract for 486 family 
SWSUPPORT960     Software Support Contract for 960 family 
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iRMKTM 960 
REAL-TIME KERNEL 



■ 32-Bit Real-Time Multi-Tasking Kernel 
for the i960™ Microprocessor Family 

■ Flexible, Modular Design to Ease 
System Integration 

■ Fast Execution with Predictable 
Response Time for Time-Critical 
Applications 

■ Compact Code Size (14 Kbytes, 
Including All Optional Modules) 

■ Requires Only an i960 KA, KB or MC 
Embedded Processor 

■ Bus Independent 

■ Easy Customization and Add-On 
Enhancements 

■ Easily EPROMmable 

■ Comprehensive Development Tool 
Support 

The iRMK 960 Real-Time Kernel is the 32-bit real-time executive developed and supported by Intel, the i960 
architecture experts. The kernel is a small, fast and highly modular package of system control software. It 
contains the basic software building blocks that act as the foundation in using the key features of the i960 
microprocessor. The iRMK 960 software is fully supported by an array of tools that work in the most popular 
development environments (i.e., DOS*, VAX/VMS*, SUN*). 

The iRMK 960 Real-Time Kernel is available off-the-shelf. The kernel reduces the cost and risk of designing 
and maintaining software for numerous real-time applications such as embedded control systems and 
dedicated real-time subsystems in multiprocessor environments. Use of the kernel can save man years that might 
otherwise be spent developing or porting another real-time kernel. This means reduced time to market for the 
user. 


*DOS® is a registered trademark of Microsoft Corporation. 
VAX/VMS™ is a trademark of Digital Equipment Corporation. 
SUN™ is a trademark of Sun Microsystems. 
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ARCHITECTURAL OVERVIEW 

At the heart of the architecture are the kernel core
modules consisting of a scheduler, task manager,
interrupt manager and time manager (see Figure 1).
As additional building blocks, the kernel provides
optional modules consisting of a mailbox manager,
semaphore manager, memory manager, on-processor
interrupt controller manager and fault handler
manager. The optional device managers for the
82380 Integrated System Peripheral (ISP) and 8254
Programmable Interval Timer (PIT) complete the
architecture.


FUNCTIONAL FEATURES 


A Full Set of Real-Time Building Blocks

The kernel provides a full set of services for real- 
time applications including task management, time 
management, synchronization of and communica- 
tions between tasks, and memory pool manage- 
ment. 


TASK MANAGEMENT 

The iRMK 960 kernel uses system calls to create, 
manage and schedule tasks in a multi-tasking envi- 
ronment. It provides pre-emptive priority scheduling 
combined with optional time-slice (round robin) 
scheduling. 

The scheduling algorithm used by the kernel en- 
ables tasks to be rescheduled in a fixed amount of 
time regardless of the number of tasks. Applications 
may contain any number of tasks. 
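
One common way to obtain a rescheduling time that does not depend on the number of tasks is to keep one "ready" bit per priority level and search the bits rather than the tasks. The following is only a minimal sketch of that general idea, not the kernel's actual algorithm, which this document does not describe. The names ready_set_t, mark_ready, mark_blocked and highest_ready, the 32 priority levels, and the convention that priority 0 is the most urgent are all assumptions made for illustration.

```c
#include <stdint.h>

#define NUM_PRIORITIES 32u  /* hypothetical: one bit per priority level */

/* Hypothetical ready set: bit p is set while at least one task of
   priority p is ready to run. Priority 0 is assumed most urgent. */
typedef struct {
    uint32_t ready_mask;
} ready_set_t;

static void mark_ready(ready_set_t *rs, unsigned prio)
{
    rs->ready_mask |= (UINT32_C(1) << prio);
}

static void mark_blocked(ready_set_t *rs, unsigned prio)
{
    rs->ready_mask &= ~(UINT32_C(1) << prio);
}

/* Returns the most urgent ready priority, or -1 if nothing is ready.
   The loop is bounded by NUM_PRIORITIES, so the search time does not
   grow with the number of tasks in the application. */
static int highest_ready(const ready_set_t *rs)
{
    unsigned p;
    for (p = 0; p < NUM_PRIORITIES; ++p) {
        if (rs->ready_mask & (UINT32_C(1) << p)) {
            return (int)p;
        }
    }
    return -1;
}
```

In this sketch the worst case is always 32 bit tests, whether the application has three tasks or three hundred.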

An application can integrate optional task handlers 
to customize task management. These handlers can 
execute on task creation, task switch, task deletion 
and task priority change. Task handlers can be used 
for a wide range of functions, including saving and 
restoring the state of coprocessor registers on task 
switch, masking interrupts based on task priority or 
implementing statistical and diagnostic monitors. 
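
As an illustration of the statistical-monitor use mentioned above, the sketch below models task handlers as a table of optional callbacks. It is a hypothetical model only: the type task_hooks_t, the function notify_switch and the handler signatures are invented for this example and are not the kernel's interface, which is defined in the iRMK 960 User's Manual.

```c
#include <stddef.h>

/* Hypothetical handler table: one optional callback per task event.
   A NULL entry means no handler is installed for that event. */
typedef struct {
    void (*on_create)(int task_id);
    void (*on_switch)(int from_task, int to_task);
    void (*on_delete)(int task_id);
    void (*on_priority_change)(int task_id, unsigned new_priority);
} task_hooks_t;

/* Example diagnostic monitor: counts task switches. */
static unsigned long switch_count;

static void count_switch(int from_task, int to_task)
{
    (void)from_task;
    (void)to_task;
    ++switch_count;
}

/* What a scheduler might do at a task switch in this model:
   call the installed handler, if any. */
static void notify_switch(const task_hooks_t *hooks, int from_task, int to_task)
{
    if (hooks->on_switch != NULL) {
        hooks->on_switch(from_task, to_task);
    }
}
```

Leaving an entry NULL costs nothing at run time beyond one pointer test, which matches the idea that handlers are optional.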

INTERRUPT MANAGEMENT 

iRMK 960 interrupts are managed by immediately
switching control to user-written interrupt handlers
when an interrupt occurs.
Response to interrupts is both fast and predictable.
Most of the kernel’s system calls can be executed
directly from interrupt handlers.

Figure 1. iRMK™ 960 Real-Time Architecture

TIME MANAGEMENT 

The time management features included in the ker- 
nel provide single-shot alarms, repetitive alarms and 
a real-time clock. In addition, alarms can be reset. 

These time management facilities can solve a wide 
range of real-time programming problems. Single- 
shot alarms, for example, can be used to handle 
timeouts. If the timeout occurs, the alarm invokes a 
user-written handler; if the event occurs before the 
timeout, the application simply deletes the alarm. 
Other uses for the kernel’s time management facilities
include polling devices with repetitive alarms,
putting tasks to sleep for specified periods of time,
or implementing a time-of-day clock.
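
The timeout pattern described above can be sketched in C as follows. This is a simplified, hypothetical model of a single-shot alarm driven by clock ticks; the names sw_alarm_t, alarm_start, alarm_tick, alarm_delete and on_timeout are assumptions for illustration and are not the kernel's system calls.

```c
#include <stddef.h>

typedef void (*alarm_handler_t)(void *arg);

/* Hypothetical single-shot alarm: fires its handler once when the
   tick count runs out, unless it is deleted first. */
typedef struct {
    unsigned        remaining_ticks;
    alarm_handler_t handler;
    void           *arg;
    int             active;
} sw_alarm_t;

static void alarm_start(sw_alarm_t *a, unsigned ticks,
                        alarm_handler_t handler, void *arg)
{
    a->remaining_ticks = ticks;
    a->handler = handler;
    a->arg = arg;
    a->active = 1;
}

/* The awaited event arrived in time: cancel the timeout. */
static void alarm_delete(sw_alarm_t *a)
{
    a->active = 0;
}

/* Called once per clock tick. */
static void alarm_tick(sw_alarm_t *a)
{
    if (!a->active) {
        return;
    }
    if (a->remaining_ticks > 0) {
        --a->remaining_ticks;
    }
    if (a->remaining_ticks == 0) {
        a->active = 0;          /* single-shot: disarm before firing */
        a->handler(a->arg);
    }
}

/* Example user-written handler: records that the timeout occurred. */
static void on_timeout(void *arg)
{
    *(int *)arg = 1;
}
```

If the event arrives first, the application calls alarm_delete and the handler never runs; otherwise the handler runs exactly once on the tick that exhausts the count.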


INTERTASK SYNCHRONIZATION AND 
COMMUNICATION 

Semaphores, regions and mailboxes are the key 
mechanisms the kernel uses for synchronizing tasks 
and communicating between tasks. 

Semaphores are objects used for intertask signaling 
and synchronization. Tasks exchange abstract 
“units” with semaphores as a means of becoming 
synchronized. A task requests a unit from a sema- 
phore to gain access to a resource. If the resource is 
available, the semaphore will have a unit to give to 
the task, enabling the task to proceed. A task sends 
a unit to a semaphore to indicate that it has released 
a previously obtained resource. 
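
The exchange of units can be modeled with a simple counter. The sketch below is a non-blocking illustration only: a real kernel would put the requesting task to sleep when no unit is available, whereas this model just reports failure. The names sem_model_t, sem_try_receive and sem_send are hypothetical.

```c
/* Hypothetical model of a semaphore as a count of available units. */
typedef struct {
    unsigned units;
} sem_model_t;

/* Request one unit. Returns 1 if a unit was available and has been
   taken, 0 otherwise (where a real kernel would make the task wait). */
static int sem_try_receive(sem_model_t *s)
{
    if (s->units == 0) {
        return 0;
    }
    --s->units;
    return 1;
}

/* Send one unit back, signalling that a resource has been released. */
static void sem_send(sem_model_t *s)
{
    ++s->units;
}
```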

A special binary type of semaphore is called a Region.
Regions are used to ensure mutual exclusion,
thus preventing deadlock when tasks contend for
control of system resources. A task holding a region’s
unit runs at the priority of the highest priority
task waiting in queue for the region’s unit.
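
The priority rule for regions can be written as a small function. This is a sketch under stated assumptions: larger numbers are taken to mean higher priority, and a holder whose own priority already exceeds that of every waiting task is assumed to keep its own priority. Neither convention is confirmed by this document, and the name effective_priority is hypothetical.

```c
#include <stddef.h>

/* Returns the priority at which the task holding a region's unit
   would run: the highest priority among the holder itself and the
   tasks queued for the unit. Larger value = higher priority
   (an assumption made for this sketch). */
static unsigned effective_priority(unsigned holder_priority,
                                   const unsigned *waiter_priorities,
                                   size_t waiter_count)
{
    unsigned best = holder_priority;
    size_t i;
    for (i = 0; i < waiter_count; ++i) {
        if (waiter_priorities[i] > best) {
            best = waiter_priorities[i];
        }
    }
    return best;
}
```

Raising the holder in this way keeps a medium-priority task from pre-empting it while a high-priority task is waiting for the region.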

Mailboxes are queues that can hold any number of 
messages and are used to exchange data between 
tasks. Either data or pointers can be sent using mail- 
boxes. The kernel allows mailbox messages to be of 
any length. High priority messages can be placed 
(jammed) at the front of the message queue to en- 
sure that they are received and processed before 
other messages queued at the mailbox. 
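
The difference between a normal send and a priority (jammed) send can be illustrated with a small ring buffer of message pointers. This sketch is a hypothetical model: it uses a fixed capacity of eight messages for simplicity, whereas the kernel's mailboxes are described as holding any number of messages, and the names mbox_t, mbox_send, mbox_jam and mbox_receive are invented for the example.

```c
#include <stddef.h>

#define MBOX_CAP 8u  /* hypothetical fixed capacity for this sketch */

typedef struct {
    const void *slot[MBOX_CAP];
    size_t      head;   /* index of the oldest message */
    size_t      count;  /* number of queued messages   */
} mbox_t;

/* Normal send: append at the tail of the queue. */
static int mbox_send(mbox_t *m, const void *msg)
{
    if (m->count == MBOX_CAP) {
        return 0;
    }
    m->slot[(m->head + m->count) % MBOX_CAP] = msg;
    ++m->count;
    return 1;
}

/* Priority send: place (jam) the message at the head of the queue. */
static int mbox_jam(mbox_t *m, const void *msg)
{
    if (m->count == MBOX_CAP) {
        return 0;
    }
    m->head = (m->head + MBOX_CAP - 1u) % MBOX_CAP;
    m->slot[m->head] = msg;
    ++m->count;
    return 1;
}

/* Receive the message at the head, or NULL if the mailbox is empty. */
static const void *mbox_receive(mbox_t *m)
{
    const void *msg;
    if (m->count == 0) {
        return NULL;
    }
    msg = m->slot[m->head];
    m->head = (m->head + 1u) % MBOX_CAP;
    --m->count;
    return msg;
}
```

With messages A and B sent normally and C jammed afterwards, the receiver in this model sees C, then A, then B.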

To ensure that high priority tasks are not blocked by
lower priority tasks, the kernel allows tasks to queue
at semaphores and mailboxes in priority order. The
kernel also supports first-in, first-out task queueing.


MEMORY POOL MANAGEMENT 

The iRMK 960 kernel uses the concept of memory 
pools to efficiently divide and manage blocks of 
memory. The memory pool manager provides for 
both fixed and variable block allocation. 

Memory can be divided into any number of pools. 
Multiple memory pools might be created for different 
speed memories, or for allocating different size 
blocks. The times to allocate and de-allocate fixed- 
size areas from within a pool have a fixed upper 
bound. 
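
A fixed upper bound on allocation time is typical of free-list allocators for fixed-size blocks, because allocating and freeing each touch only the head of a list. The sketch below shows that general technique; it is not the kernel's memory manager. The block size, block count and the names pool_t, pool_init, pool_alloc and pool_free are assumptions for illustration.

```c
#include <stddef.h>

#define BLOCK_SIZE  32u  /* hypothetical block size in bytes */
#define BLOCK_COUNT 4u   /* hypothetical number of blocks    */

/* While a block is free, its storage holds the free-list link. */
typedef union block {
    union block  *next;
    unsigned char bytes[BLOCK_SIZE];
} block_t;

typedef struct {
    block_t  storage[BLOCK_COUNT];
    block_t *free_list;
} pool_t;

static void pool_init(pool_t *p)
{
    size_t i;
    p->free_list = NULL;
    for (i = 0; i < BLOCK_COUNT; ++i) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Constant time: unlink the first free block, or return NULL. */
static void *pool_alloc(pool_t *p)
{
    block_t *b = p->free_list;
    if (b != NULL) {
        p->free_list = b->next;
    }
    return b;
}

/* Constant time: push the block back on the free list. */
static void pool_free(pool_t *p, void *area)
{
    block_t *b = area;
    b->next = p->free_list;
    p->free_list = b;
}
```

Neither operation loops, so the time taken is the same whether the pool is nearly empty or nearly full.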

The kernel-supplied memory manager works with 
flat memory architecture. Users can also write their 
own memory manager to provide different memory 
management policies or support virtual memory. 


Hardware Requirements and Support 

The kernel requires only an i960 microprocessor and 
sufficient memory for itself and its application. The 
kernel’s design, however, recognizes that many sys- 
tems use additional programmable peripheral devic- 
es and coprocessors. The kernel provides optional 
device managers for: 

• The 82380 Integrated System Peripheral (ISP) 
chip 

• The 8254 Programmable Interval Timer (PIT) chip 

An application can supply managers for other devices
and coprocessors in addition to, or in replacement
of, the devices listed above.

The openness of the iRMK 960 kernel is a major
benefit to the OEM. The kernel is designed to be
programmed into PROM or EPROM, making it easy
to use in embedded designs. In addition, it can be
used with any system bus, including those of the
MULTIBUS I and MULTIBUS II bus architectures.


A Modular Architecture for Easy 
Customization 

The kernel is designed for maximum flexibility. It can 
be customized for any application. Each major func- 
tion, mailboxes for example, is implemented as a 
separate module. The kernel’s modules have not 
been linked together and are supplied individually. 
(See Table 1 for the list of kernel modules, and their 
approximate sizes.) 

The user links only the modules needed for the
application. Any module not used does not need to be
linked in, and does not increase the size of the
kernel in the application. The user can also replace
any optional kernel module with one that implements
specific features required by the application.
For example, the user might want to replace the
kernel’s memory manager with one that supports virtual
memory.

Table 1. iRMK™ 960 Kernel Modules
and Approximate Sizes

Module                                  Bytes

Core Modules
  Task Manager                           2600
  Interrupt Manager                       150
  Time Manager                           3000
  Scheduler                              1700
  Initialization                           50

Optional Modules
  Mailbox Manager                        1250
  Semaphore Manager                      2900
  Memory Manager                         1260
  Fault Handler Manager                    50
  Miscellaneous                           300

Optional Device Managers
  82380 Integrated System Peripheral     4200
  8254 Programmable Interval Timer       1200


Total size of the (entire) kernel (minus device man- 
agers) is about 13.5 Kbytes. 
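
Because only the linked modules count toward the kernel's footprint, the size of a given configuration can be estimated by adding up the Table 1 entries. The sketch below does this for the full set of modules. The array and function names are hypothetical, the byte counts are the approximate figures from Table 1, and the 8254 entry is read as 1200 bytes.

```c
#include <stddef.h>

/* Approximate module sizes in bytes, taken from Table 1. */
static const unsigned core_bytes[]     = {2600, 150, 3000, 1700, 50};
static const unsigned optional_bytes[] = {1250, 2900, 1260, 50, 300};
static const unsigned device_bytes[]   = {4200, 1200};

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Adds up the sizes of the modules selected for linking. */
static unsigned sum_bytes(const unsigned *sizes, size_t count)
{
    unsigned total = 0;
    size_t i;
    for (i = 0; i < count; ++i) {
        total += sizes[i];
    }
    return total;
}
```

With these figures the core modules come to 7500 bytes and the optional modules to 5760 bytes, or 13,260 bytes together, in line with the approximate total quoted above. The two device managers would add a further 5400 bytes.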


Developing with the iRMK™ 960
Real-Time Kernel

Kernel applications can be written using any language
or compiler that produces code that executes
on the i960 microprocessor. This independence is
achieved by using an interface library. This library
works with the idiosyncrasies of a particular language
(for example, the ordering of parameters).
The interface library translates the calls provided by
the language into a standard format expected by the
kernel. Intel provides an interface library for our iC
960 compiler. The source code of this library is included,
so that the user can modify it to support other
compilers.

Because the kernel is supplied as unlinked object 
modules, applications can be developed on any sys- 
tem that hosts the development tools needed. 

Comprehensive Development Tool 
Support 

Intel provides a complete line of 80960 development
tools for writing and debugging iRMK 960 applications.

These tools include:

Software:
  ASM 960 assembler
  iC 960 compiler

NOTE:
These tools are available for DOS,
VAX/VMS*, MicroVAX*, SUN* and
EVA960KB4MB environments.

Debuggers:
  ICE™ 960 In-Circuit Emulator for the i960 microprocessor
  SDM™ 960 System Debug Monitor for the i960 microprocessor

Evaluation Vehicles:
  EVA960KB     AT Bus-Compatible Board
  EVA960KB4MB  AT Bus-Compatible Board with 4 Mbytes of Memory
  QT960        Standalone Evaluation Vehicle


Intel Support, Consulting and Training 

With iRMK 960 kernel software, the developer has
available the total Intel i960 architecture and
real-time expertise of Intel’s support engineers. Intel
provides telephone support, on- or off-site consulting,
troubleshooting guides and updates. The kernel
includes 90 days of Intel’s Technical Information
Phone Service (TIPS). Extended support and
consulting are also available.


Contents of the iRMK™ 960 Kernel
Development Package

The iRMK 960 Kernel comes in a comprehensive
package including:

• Kernel object modules

• Source for the kernel-supplied 82380 Integrated
System Peripheral and 8254 PIT device managers

• Source for the iC 960 interface library

• Source for sample applications showing the
following:

  - Structure of kernel applications

  - Use of the kernel with an application written in
    iC 960 language

  - Compile, bind and build sequences

  - Sample initialization code for the i960
    microprocessor

  - Applications written to execute in a flat memory
    space

• User reference guide

• 90 days of customer support
LICENSING

iRMK 960 software requires prior execution of the 
standard Intel Software License Agreement (SLA). A 
single development copy requires a Class I license 
and allows iRMK 960 software to be loaded and run 
on one single-processor system. 


SPECIFICATIONS 
System Calls 

The following items are system calls arranged by 
type: 

iRMK™ 960 KERNEL SYSTEM CALLS LISTING

KERNEL INITIALIZATION

KN_initialize               Initialize kernel

OBJECT MANAGEMENT

KN_token_to_ptr             Returns a pointer to the area holding object
KN_current_task             Returns a pointer for the current task

TASK MANAGEMENT

KN_create_task              Creates a task
KN_delete_task              Deletes a task
KN_suspend_task             Suspends a task
KN_resume_task              Resumes a task
KN_set_priority             Change priority of a task
KN_get_priority             Return priority of a task

INTERRUPT MANAGEMENT

KN_set_interrupt            Specify interrupt handler
KN_stop_scheduling          Suspend task switching
KN_start_scheduling         Resume task switching

TIME MANAGEMENT

KN_sleep                    Put calling task to sleep
KN_create_alarm             Create and start virtual alarm clock
KN_reset_alarm              Reset an existing alarm
KN_delete_alarm             Delete alarm
KN_get_time                 Get time
KN_set_time                 Set time
KN_tick                     Notify kernel that clock tick has occurred

INTERTASK COMMUNICATION AND SYNCHRONIZATION

KN_create_semaphore         Create a semaphore
KN_delete_semaphore         Delete a semaphore
KN_send_unit                Add a unit to a semaphore
KN_receive_unit             Receive a unit from a semaphore
KN_create_mailbox           Create a mailbox
KN_delete_mailbox           Delete a mailbox
KN_send_data                Send data to a mailbox
KN_send_priority_data       Place (jam) priority message at head of message queue
KN_receive_data             Request a message from a mailbox

MEMORY MANAGEMENT

KN_create_pool              Create a memory pool
KN_delete_pool              Delete a memory pool
KN_create_area              Create a memory area from a pool
KN_delete_area              Return a memory area to a memory pool
KN_get_pool_attributes      Get a memory pool’s attributes

PROGRAMMABLE INTERRUPT CONTROLLER MANAGEMENT

KN_initialize_PICs          Initialize PICs
KN_mask_slot                Mask out interrupts on a specified slot
KN_unmask_slot              Unmask interrupts on a specified slot
KN_send_EOI                 Signal the PIC that the interrupt on a specified slot has been serviced
KN_new_masks                Change interrupt masks
KN_get_slot                 Return the most important active interrupt slot
KN_get_interrupt            Get address of specified interrupt handler

PROGRAMMABLE INTERVAL CONTROLLER MANAGEMENT

KN_initialize_PIT           Initialize the PIT
KN_start_PIT                Start PIT counting
KN_get_PIT_interval         Return PIT interval

PROCESSOR RECOGNIZED FAULT HANDLING

KN_get_fault_handler        Get address of fault handler currently associated with specified fault type
KN_set_fault_handler        Establish address of fault handler for the specified fault type

PROCESSOR INTERRUPT CONTROLLER SUPPORT

KN_get_processor_priority   Returns value of the processor priority
KN_set_processor_priority   Change the value of the processor priority

PERFORMANCE 

The figures listed below were derived from a test
suite running on an EVA-960 evaluation vehicle using
an 80960KB running at 20 MHz. The EVA-960 has
what is known as 2-1-1-1 wait state memory: the
first instruction of a four-instruction fetch takes
two wait states, and each of the three successive
instructions takes one wait state.
The figures are the worst-case values obtained from
several sets of test runs. The code was generated
using the iC 960 DOS-hosted compiler, Version 1.1.
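
The cost of the 2-1-1-1 memory in the test system can be estimated with a short calculation. The sketch below assumes that one wait state lasts exactly one processor clock period (50 ns at 20 MHz) and counts only the wait states themselves, not the underlying bus cycles. The function names are hypothetical.

```c
#include <stddef.h>

/* Total wait states in one burst, given the wait states per access
   (for example {2, 1, 1, 1} for 2-1-1-1 memory). */
static unsigned burst_wait_states(const unsigned *per_access, size_t count)
{
    unsigned total = 0;
    size_t i;
    for (i = 0; i < count; ++i) {
        total += per_access[i];
    }
    return total;
}

/* Time spent in wait states, in nanoseconds, assuming one wait state
   equals one clock period at the given clock frequency in MHz. */
static unsigned wait_time_ns(unsigned wait_states, unsigned clock_mhz)
{
    return wait_states * 1000u / clock_mhz;
}
```

Under these assumptions a four-instruction fetch incurs 2 + 1 + 1 + 1 = 5 wait states, or 250 ns at 20 MHz, which is one reason the figures below should be read as specific to this memory system.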

Action                                               Time (in µs)

Create Pool                                                18
Get Pool Attributes                                        36
Delete Pool                                                 1
Create Area                                                35
Delete Area                                                32
Create Semaphore                                            6
Delete Semaphore                                           14
FIFO Semaphore Send Unit                                    7
FIFO Semaphore Receive Unit                                 7
Region Semaphore Send Unit                                 18
Region Semaphore Receive Unit                              14
Create Mailbox                                             19
Delete Mailbox                                             23
Send Data                                                  21
Receive Data                                               21
Create Alarm                                               29
Delete Alarm                                               30
FIFO Semaphore Send/Receive Unit with Task Switch          75
Suspend Task with Task Switch                              70
Basic Task Switch                                          50
Create Task                                                62
Suspend Task                                               26
Resume Task                                                50
Delete Task                                                50
Get Priority                                                5
Set Priority                                               27
Set Interrupt                                               3
Get Interrupt                                               3


MANUALS 

iRMK 960 User’s Manual (Intel Order #463863-001).

TRAINING INFORMATION 

Intel Customer Service Training: 

“80960 KA/KB Embedded Processor Training 
Course” 

ORDERING INFORMATION 

Ordering Code Product Description 

RMK960          iRMK 960 Real-Time Kernel





EV80960CA Evaluation Board 









Low Cost Processor Evaluation Tool 

Intel’s EV80960CA evaluation board provides a low-cost hardware environment for code 
execution and software debugging. The board features the 80960CA, the newest and 
highest performance member of Intel’s family of 32-bit embedded microprocessors. The 
board allows a user’s program to take full advantage of the power of the 80960CA and 
provides zero wait state execution of the user’s code. 

Popular features such as single line assembler/disassembler, single-step program 
execution and software breakpoints are standard on the EV80960CA’s on-board monitor. 
Available separately, Intel offers a complete code development environment using the 
assembler (ASM-960) as well as high-level languages, such as Intel’s iC-960 C compiler, to 
accelerate development schedules. 

The EV80960CA evaluation board package features the 80960CA System Debug Monitor
(SDM) in EPROM, an SDM host software floppy disk, a power supply cable, a 9-pin PC/AT
serial connector for a terminal and the EV80960CA User’s Manual. The EV80960CA
User’s Manual includes schematics of the board, a parts list and programmable logic
device (PLD) equations. The board is hosted on an IBM or BIOS-compatible PC/AT.


*The SRAM memory system provides zero wait state read (0-0-0-0-0) and one wait state write (1-1-1-1-0) performance.
**The DRAM memory system provides 2-1-1-1-1 reads and writes.
EV80960CA Features

• 25 MHz Execution Speed 

• 32 Kbytes of EPROM for 80960CA SDM
Target Operating Firmware

• 64 Kbytes of Zero Wait State Pipelined 
SRAM* 

• 1 Mbyte of Static-Column Mode DRAM**
expandable to 4 Mbytes

• Concurrent Interrogation of Memory and 
Registers 

• Software Breakpoints 

• Code Disassembly 

• High-Level Language Support 

• Two RS-232s for Host and User 
Communication 

• Two iSBX I/O Connectors 

• An Expansion Bus to Accommodate 
Eurocard Form-Factor Prototyping Boards 

Fast Pipelined SRAM Memory 
System 

The pipelined-read memory system of the
EV80960CA provides true zero wait state read
and one wait state write performance. The memory
design utilizes the internal wait state
generator of the 80960CA.


Fast Static-Column Mode DRAM 

The memory design of the EV80960CA uses 
the 80960CA burst mode bus and static-column 
DRAM mode. The DRAM control PLDs are 
functionally isolated into interconnected state 
machines. The PLDs can be changed to allow 
alternative DRAM memory implementations 
with different DRAM access modes (static- 
column mode, nibble mode or fast-page mode). 

Concurrent Interrogation of 
Memory and Registers 

The 80960CA System Debug Monitor (SDM) for 
the EV80960CA allows the user to read and 
modify internal registers and external memory 
while the user's program is running on the 
board. 

iSBX I/O Connectors and 
Expansion Interface 

The EV80960CA evaluation board has two 
connectors to support both 8- and 16-bit 
standard iSBX Expansion Modules. The board 
also provides an expansion bus to 
accommodate Eurocard form-factor 
prototyping boards. 
Block Diagram of the EV80960CA Board


Communication Link 

The EV80960CA board communicates with the 
host through the RS-232 link using an Intel 
82510 UART provided on board. The board 
supports seven baud rates: 300, 1200, 2400, 
4800, 9600, 19200 and 38400. 
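
The choice of baud rate mainly affects how long a code download takes. The sketch below estimates the transfer time for a given image size, assuming 10 bits on the wire per byte (one start bit, eight data bits, one stop bit, no parity) and ignoring any protocol overhead added by the monitor. Both the framing and the function name transfer_ms are assumptions for illustration; this document does not state the serial frame format.

```c
#include <stdint.h>

/* Estimated transfer time in milliseconds (rounded down) for `bytes`
   bytes at `baud` bits per second, assuming 10 bits per byte. */
static uint64_t transfer_ms(uint64_t bytes, uint32_t baud)
{
    return bytes * 10u * 1000u / baud;
}
```

Under these assumptions, filling the 64 Kbytes of SRAM (65,536 bytes) takes roughly 17 seconds at 38400 baud and over 36 minutes at 300 baud.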

Power Requirements 

The EV80960CA Evaluation Board requires 5V 
at 2000 mA and ± 12V at 25 mA. 


Host System Requirements 

The EV80960CA Evaluation Board is hosted on 
an IBM PC/AT or compatibles; a 386-based PC 
is recommended. The host system must meet 
the following minimum requirements: 

• 512 Kbytes of Memory 

• One 1.2 Mbyte Floppy Disk Drive 

• PC-DOS 3.2 or Later 

• A Serial Port (COM1 or COM2)
i960™ SA/SB EVALUATION BOARD


The EV80960SX board is a general-purpose evaluation tool for the i960™ SA/SB
embedded processors. This evaluation board provides a high-performance DRAM
subsystem, an interleaved EPROM subsystem, and a robust set of peripheral devices for
benchmarking and debugging application code written for the i960 SA/SB embedded
processors.

The EV80960SX is a great starter kit for your 32-bit application. The EV80960SX and the
NINDY debug environment, along with an assembler and C compiler (not provided), provide
a seamless environment for developing code and evaluating the i960 SA/SB processors.
The NINDY monitor provides code download capabilities from a number of popular
development systems, including DOS-based PCs. Single step, breakpoints, and register and
memory display are among the full set of features provided by NINDY.


The board is provided with the following 
features: 

• DRAM Subsystem operates at
1-0-0-0-0-0-0-0 wait states for read and
write cycles in the burst mode. The
DRAM subsystem runs at the maximum
processor frequency of 16 MHz, using
100 ns fast page mode DRAMs. The
DRAM subsystem can accommodate
from 512 Kbytes to 4 Mbytes, using 4 or 8
ZIP-packaged DRAMs.

• Interleaved EPROM Subsystem executes
burst program fetches with a 2-0-1-0-2-0-1-0
wait state performance.
The EPROM subsystem accommodates
four 32-pin or 28-pin 8-bit wide EPROMs
with up to 150 ns access times.

• Flash EPROM Subsystem reads and 
writes two 8-bit wide Flash EPROMS. 

• 8259A Interrupt Controller provides 
expanded interrupt capabilities using 
the i960 SA/SB’s interrupt controller 
interface. 

• Parallel Port Input allows fast 
downloads of code or data to the 
EV80960SX board. The parallel port 
provides auto-busy and interrupt 
capabilities, and is a full implementation 
of the Centronics standard. 


ACE51®, ICE® and MCS® are registered trademarks of Intel Corporation. 
Ethernet® is a registered trademark of Xerox Corporation. 

•CHMOS is a patented Intel process. 
• Two serial ports provide queued and
interrupt-driven serial transfer at up to
128000 baud.

• 82C54 Timer/Counter provides a 32-bit
counter and 16-bit counter, each with
dedicated interrupts.

• Expansion/Prototype Bus (XBUS) allows
expansion cards and prototype hardware
direct access to the i960 SA/SB’s bus and
control signals. Optionally, a configurable
wait state scheme provides a no-glue
interface to most peripherals attached to the
XBUS.

• LEDs and Switches are user programmable.
One 10-segment bar LED, a 7-segment LED
and an 8-position switch are under program
control.

• Local Area Networking (LAN) is
implemented using an 82596SX LAN
coprocessor.


• Laser Printer Control provides interfaces to 
TEC or Canon compatible laser engines. 

• Monitor and Self-test diagnostics are 
provided for the EV80960SX in the EPROMs 
installed in the board. 

The evaluation board comes complete with a 
design database included on diskette, the 
NINDY debug monitor on diskette and in 
EPROM, power and serial cables, schematics 
and user’s manual. 

The EV80960SX is a public domain design. The 
hardware is fully documented and provides 
working examples of popular memory and 
peripheral interfaces to the i960 SA/SB 
processor. The schematic and PLD database 
are provided with each board. The EV80960SX 
designs are easily duplicated and can be used 
directly as the building blocks for custom 
designs. Custom hardware can be prototyped 
using the expansion bus (XBUS) connector. 


[Block diagram of the EV80960SX board. Recoverable labels: i960™ SB-16 Processor, Buffers, 82C54 Timer/Counter, 8259A Interrupt Controller, Laser Printer Interface, Centronics]
NORTH AMERICAN SALES OFFICES


ALABAMA 

Intel Corp. 

5015 Bradford Dr., #2 
Huntsville 35805 
Tel: (205) 830-4010 
FAX: (205) 837-2640 

ARIZONA

†Intel Corp.

410 North 44th Street 
Suite 500 
Phoenix 85008 
Tel: (602) 231-0386 
FAX: (602) 244-0446

CALIFORNIA 

†Intel Corp.

21515 Vanowen Street 
Suite 116 

Canoga Park 91303 
Tel: (818) 704-8500 
FAX: (818) 340-1144 

Intel Corp. 

1 Sierra Gate Plaza 
Suite 280C 
Roseville 95678 
Tel: (916) 782-8086 
FAX: (916) 782-8153

†Intel Corp.

9665 Chesapeake Dr. 

Suite 325 
San Diego 92123 
Tel: (619) 292-8086
FAX: (619) 292-0628 

*†Intel Corp.

400 N. Tustin Avenue 
Suite 450 
Santa Ana 92705 
Tel: (714) 835-9642
TWX: 910-595-1114 
FAX: (714) 541-9157 

*†Intel Corp.

San Tomas 4 

2700 San Tomas Expressway 

2nd Floor 

Santa Clara 95051 

Tel: (408) 986-8086 

TWX: 910-338-0255 

FAX: (408) 727-2620 

COLORADO 

Intel Corp. 

4445 Northpark Drive 
Suite 100 

Colorado Springs 80907 
Tel: (719) 594-6622 
FAX: (303) 594-0720 

*†Intel Corp.

600 S. Cherry St. 

Suite 700 
Denver 80222 
Tel: (303) 321-8086 
TWX: 910-931-2289 
FAX: (303) 322-8670 

CONNECTICUT 
†Intel Corp.

301 Lee Farm Corporate Park 
83 Wooster Heights Rd. 
Danbury 06810 
Tel: (203) 748-3130 
FAX: (203) 794-0339 

FLORIDA 

†Intel Corp.

800 Fairway Drive 
Suite 160 

Deerfield Beach 33441 
Tel: (305) 421-0506 
FAX: (305) 421-2444 


†Intel Corp.

5850 T.G. Lee Blvd. 

Suite 340 
Orlando 32822 
Tel: (407) 240-8000 
FAX: (407) 240-8097

GEORGIA 

†Intel Corp.

20 Technology Parkway 
Suite 150 
Norcross 30092 
Tel: (404) 449-0541 
FAX: (404) 605-9762 

ILLINOIS 

*†Intel Corp.

Woodfield Corp. Center III 
300 N. Martingale Road 
Suite 400 

Schaumburg 60173 
Tel: (708) 605-8031 
FAX: (708) 706-9762 

INDIANA 

†Intel Corp.

8910 Purdue Road 
Suite 350 

Indianapolis 46268 
Tel: (317) 875-0623 
FAX: (317) 875-8938 

MARYLAND 

*†Intel Corp.

10010 Junction Dr. 

Suite 200 

Annapolis Junction 20701 
Tel: (301) 206-2860 
FAX: (301) 206-3677 
(301) 206-3678 

MASSACHUSETTS 

*†Intel Corp.

Westford Corp. Center 
3 Carlisle Road 
2nd Floor 
Westford 01886 
Tel: (508) 692-0960 
TWX: 710-343-6333 
FAX: (508) 692-7867 

MICHIGAN 

†Intel Corp.

7071 Orchard Lake Road 
Suite 100 

West Bloomfield 48322 
Tel: (313) 851-8096 
FAX: (313) 851-8770

MINNESOTA 

†Intel Corp.

3500 W. 80th St. 

Suite 360 

Bloomington 55431 
Tel: (612) 835-6722 
TWX: 910-576-2867
FAX: (612) 831-6497 

NEW JERSEY 

*†Intel Corp.

Lincroft Office Center 
125 Half Mile Road 
Red Bank 07701 
Tel: (908) 747-2233 
FAX: (908) 747-0983

NEW YORK 

*Intel Corp.

850 Crosskeys Office Park 
Fairport 14450 
Tel: (716) 425-2750
TWX: 510-253-7391 
FAX: (716) 223-2561 


*†Intel Corp.

2950 Express Dr., South 
Suite 130 
Islandia 11722 
Tel: (516) 231-3300 
TWX: 510-227-6236
FAX: (516) 348-7939 

†Intel Corp.

300 Westage Business Center 

Suite 230 

Fishkill 12524 

Tel: (914) 897-3860 

FAX: (914) 897-3125 

OHIO 

*†Intel Corp.

3401 Park Center Drive 
Suite 220 
Dayton 45414 
Tel: (513) 890-5350 
TWX: 810-450-2528 
FAX: (513) 890-8658 

*†Intel Corp.

25700 Science Park Dr. 

Suite 100 
Beachwood 44122
Tel: (216) 464-2736 
TWX: 810-427-9298 
FAX: (804) 282-0673 

OKLAHOMA 

Intel Corp. 

6801 N. Broadway 
Suite 115 

Oklahoma City 73162 
Tel: (405) 848-8086 
FAX: (405) 840-9819 

OREGON 
†Intel Corp.

15254 N.W. Greenbrier Pkwy. 

Building B 

Beaverton 97006 

Tel: (503) 645-8051 

TWX: 910-467-8741 

FAX: (503) 645-8181 

PENNSYLVANIA 

*†Intel Corp.

925 Harvest Drive 
Suite 200 
Blue Bell 19422 
Tel: (215) 641-1000 
FAX: (215) 641-0785 

*†Intel Corp.

400 Penn Center Blvd. 

Suite 610 
Pittsburgh 15235 
Tel: (412) 823-4970 
FAX: (412) 829-7578

PUERTO RICO 

†Intel Corp.

South Industrial Park 
P.O. Box 910 
Las Piedras 00671 
Tel: (809) 733-8616 

TEXAS 
†Intel Corp.

8911 N. Capital of Texas Hwy. 

Suite 4230 

Austin 78759 

Tel: (512) 794-8086 

FAX: (512) 338-9335 

*†Intel Corp.

12000 Ford Road 
Suite 400 
Dallas 75234 
Tel: (214) 241-8087 
FAX: (214) 484-1180 


*†Intel Corp.

7322 S.W. Freeway 
Suite 1490 
Houston 77074 
Tel: (713) 988-8086 
TWX: 910-881-2490
FAX: (713) 988-3660 

UTAH 

†Intel Corp.

428 East 6400 South 
Suite 104 
Murray 84107 
Tel: (801) 263-8051 
FAX: (801) 268-1457 

WASHINGTON 

†Intel Corp.

155 108th Avenue N.E. 
Suite 386 
Bellevue 98004 
Tel: (206) 453-8086 
TWX: 910-443-3002 
FAX: (206) 451-9556 

Intel Corp. 

408 N. Mullan Road 
Suite 102 
Spokane 99206 
Tel: (509) 928-8086
FAX: (509) 928-9467 

WISCONSIN 

Intel Corp. 

330 S. Executive Dr. 
Suite 102
Brookfield 53005
Tel: (414) 784-8087
FAX: (414) 796-2115 


CANADA 


BRITISH COLUMBIA 

Intel Semiconductor of 
Canada, Ltd. 

4585 Canada Way 
Suite 202 
Burnaby V5G 4L6 
Tel: (604) 298-0387
FAX: (604) 298-8234 


ONTARIO 

†Intel Semiconductor of
Canada, Ltd. 

2650 Queensview Drive 
Suite 250 
Ottawa K2B 8H6 
Tel: (613) 829-9714 
FAX: (613) 820-5936 

†Intel Semiconductor of
Canada, Ltd. 

190 Attwell Drive 
Suite 500 
Rexdale M9W 6H8 
Tel: (416) 675-2105 
FAX: (416) 675-2438 


QUEBEC 

†Intel Semiconductor of
Canada, Ltd. 

1 Rue Holiday 
Suite 115 
Tour East 
Pt. Claire H9R 5N3 
Tel: (514) 694-9130 
FAX: (514) 694-0064


†Sales and Service Office
*Field Application Location




NORTH AMERICAN DISTRIBUTORS 


ALABAMA 

Arrow Electronics, Inc. 

1015 Henderson Road 
Huntsville 35806 
Tel: (205) 837-6955 
FAX: (205) 721-1581 

Hamilton/Avnet Electronics 
4960 Corporate Drive, #135 
Huntsville 35805 
Tel: (205) 837-7210 
FAX: (205) 721-0356 

MTI Systems Sales 
4950 Corporate Drive 
Suite 120 
Huntsville 35805 
Tel: (205) 830-9526 
FAX: (205) 830-9557 

Pioneer/Technologies Group, Inc. 
4835 University Square, #5 
Huntsville 35805 
Tel: (205) 837-9300 
FAX: (205) 837-9358 

ARIZONA 

†Arrow Electronics, Inc.

4134 E. Wood Street 
Phoenix 85040 
Tel: (602) 437-0750 
FAX: (602) 252-9109

Avnet Computer 
30 South McKemy Avenue 
Chandler 85226 
Tel: (602) 961-6460 
FAX: (602) 961-4787

Hamilton/Avnet Electronics 
30 South McKemy Avenue 
Chandler 85226 
Tel: (602) 961-6403 
FAX: (602) 961-1331 

Wyle Distribution Group 
4141 E Raymond 
Phoenix 85040 
Tel: (602) 437-2088
FAX: (602) 437-2124

CALIFORNIA 

Arrow Commercial System Group 
1502 Crocker Avenue 
Hayward 94544 
Tel: (415) 489-5371 
FAX: (415) 489-9393

Arrow Commercial System Group 
14242 Chambers Road 
Tustin 92680 
Tel: (714) 544-0200 
FAX: (714) 731-8438 

†Arrow Electronics, Inc.
19748 Dearborn Street 
Chatsworth 91311 
Tel: (818) 701-7500 
FAX: (818) 772-8930 

†Arrow Electronics, Inc.

9511 Ridgehaven Court
San Diego 92123 
Tel: (619) 565-4800 
FAX: (619) 279-8062 

†Arrow Electronics, Inc.

1180 Murphy Avenue 
San Jose 95131 
Tel: (408) 441-9700 
FAX: (408) 453-4810 

†Arrow Electronics, Inc.

2961 Dow Avenue 
Tustin 92680 
Tel: (714) 838-5422 
FAX: (714) 838-4151

Avnet Computer 
3170 Pullman Street 
Costa Mesa 92626 
Tel: (714) 641-4121 
FAX: (714) 641-4170 

Avnet Computer 
1361B West 190th Street
Gardena 90248
Tel: (800) 345-3870
FAX: (213) 327-5389


Avnet Computer 
755 Sunrise Blvd., #150 
Roseville 95661 
Tel: (916) 781-2521 
FAX: (916) 781-3819 

Avnet Computer 
1175 Bordeaux Drive, #A 
Sunnyvale 94089 
Tel: (408) 743-3304 
FAX: (408) 743-3348

Avnet Computer 
21150 Califa Street 
Woodland Hills 91376 
Tel: (808) 345-3870
FAX: (818) 594-8333 

†Hamilton/Avnet Electronics
3170 Pullman Street 
Costa Mesa 92626 
Tel: (714) 641-4100 
FAX: (714) 754-6033

†Hamilton/Avnet Electronics
1175 Bordeaux Drive, #A
Sunnyvale 94089 
Tel: (408) 743-3300 
FAX: (408) 745-6679 

†Hamilton/Avnet Electronics
4545 Viewridge Avenue 
San Diego 92123 
Tel: (619) 571-1900 
FAX: (619) 571-8761

†Hamilton/Avnet Electronics
21150 Califa St. 

Woodland Hills 91367 
Tel: (818) 594-0403 
FAX: (818) 594-8234 

†Hamilton/Avnet Electronics
1361B West 190th Street 
Gardena 90248 
Tel: (213) 516-8600 
FAX: (213) 217-6822 

†Hamilton/Avnet Electronics
755 Sunrise Avenue, #150 
Roseville 95661 
Tel: (916) 925-2216
FAX: (916) 925-3478 

Pioneer/Technologies Group, Inc. 

134 Rio Robles 

San Jose 95134 

Tel: (408) 954-9100 

FAX: (408) 954-9113

†Wyle Distribution Group
124 Maryland Street
El Segundo 90245
Tel: (213) 322-8100
FAX: (213) 416-1151 

Wyle Distribution Group 
7431 Chapman Ave. 

Garden Grove 92641 
Tel: (714) 891-1717
FAX: (714) 891-1621 

†Wyle Distribution Group
2951 Sunrise Blvd., Suite 175 
Rancho Cordova 95742 
Tel: (916) 638-5282 
FAX: (916) 638-1491 

†Wyle Distribution Group
9525 Chesapeake Drive 
San Diego 92123 
Tel: (619) 565-9171 
FAX: (619) 365-0512 

†Wyle Distribution Group
3000 Bowers Avenue 
Santa Clara 95051 
Tel: (408) 727-2500 
FAX: (408) 727-5896 

†Wyle Distribution Group
17872 Cowan Avenue 
Irvine 92714 
Tel: (714) 863-9953 
FAX: (714) 263-0473 

†Wyle Distribution Group
26010 Mureau Road, #150 
Calabasas 91302 
Tel: (818) 880-9000 
FAX: (818) 880-5510


COLORADO 

Arrow Electronics, Inc.

3254 C Frazer Street
Aurora 80011
Tel: (303) 373-5616
FAX: (303) 373-5760 

†Hamilton/Avnet Electronics
9605 Maroon Circle, #200 
Englewood 80112 
Tel: (303) 799-7800 
FAX: (303) 799-7801 

†Wyle Distribution Group
451 E. 124th Avenue
Thornton 80241 
Tel: (303) 457-9953 
FAX: (303) 457-4831 


CONNECTICUT 

†Arrow Electronics, Inc.

12 Beaumont Road
Wallingford 06492 
Tel: (203) 265-7741 
FAX: (203) 265-7988 

Avnet Computer 
55 Federal Road, #103 
Danbury 06810 
Tel: (203) 797-2880 
FAX: (203) 791-9050 

†Hamilton/Avnet Electronics
55 Federal Road, #103 
Danbury 06810 
Tel: (203) 743-6077 
FAX: (203) 791-9050 

†Pioneer/Standard Electronics
112 Main Street
Norwalk 06851 
Tel: (203) 853-1515 
FAX: (203) 838-9901 


FLORIDA 

†Arrow Electronics, Inc.
400 Fairway Drive, #102
Deerfield Beach 33441 
Tel: (305) 429-8200 
FAX: (305) 428-3991 

†Arrow Electronics, Inc.
37 Skyline Drive, #3101
Lake Mary 32746 
Tel: (407) 333-9300 
FAX: (407) 333-9320 

Avnet Computer
3343 W. Commercial Blvd.
Bldg. C/D, Suite 107
Ft. Lauderdale 33309 
Tel: (305) 979-9067 
FAX: (305) 730-0368 

Avnet Computer 
3247 Tech Drive North 
St. Petersburg 33716 
Tel: (813) 573-5524 
FAX: (813) 572-4324 

†Hamilton/Avnet Electronics
5371 N.W. 33rd Avenue 
Ft. Lauderdale 33309 
Tel: (305) 484-5016 
FAX: (305) 484-8369 

†Hamilton/Avnet Electronics
3247 Tech Drive North
St. Petersburg 33716
Tel: (813) 573-3930
FAX: (813) 572-4329

†Hamilton/Avnet Electronics
7079 University Boulevard
Winter Park 32791
Tel: (407) 657-3300
FAX: (407) 678-1878

†Pioneer/Technologies Group, Inc.
337 Northlake Blvd., Suite 1000
Altamonte Springs 32701
Tel: (407) 834-9090
FAX: (407) 834-0865


Pioneer/Technologies Group, Inc.
674 S. Military Trail 
Deerfield Beach 33442 
Tel: (305) 428-8877 
FAX: (305) 481-2950 


GEORGIA 

Arrow Commercial System Group 
3400 C. Corporate Way 
Duluth 30136 
Tel: (404) 623-8825 
FAX: (404) 623-8802 

†Arrow Electronics, Inc.
4250 E. Rivergreen Pkwy., #E
Duluth 30136 
Tel: (404) 497-1300 
FAX: (404) 476-1493 

Avnet Computer 
3425 Corporate Way, #G 
Duluth 30136 
Tel: (404) 623-5452 
FAX: (404) 476-0125

Hamilton/Avnet Electronics 
3425 Corporate Way, #G 
Duluth 30136 
Tel: (404) 446-0611
FAX: (404) 446-1011

Pioneer/Technologies Group, Inc. 
4250 C. Rivergreen Parkway 
Duluth 30136 
Tel: (404) 623-1003 
FAX: (404) 623-0665 


ILLINOIS 

†Arrow Electronics, Inc.
1140 W. Thorndale Rd.
Itasca 60143
Tel: (708) 250-0500 

Avnet Computer 
1124 Thorndale Avenue
Bensenville 60106
Tel: (708) 860-8573
FAX: (708) 773-7976

†Hamilton/Avnet Electronics
1130 Thorndale Avenue 
Bensenville 60106 
Tel: (708) 860-7700 
FAX: (708) 860-8530 

MTI Systems
1140 W. Thorndale Avenue
Itasca 60143
Tel: (708) 250-8222
FAX: (708) 250-8275

†Pioneer/Standard Electronics
2171 Executive Dr., Suite 200 
Addison 60101 
Tel: (708) 495-9680 
FAX: (708) 495-9831 


INDIANA 

†Arrow Electronics, Inc.
7108 Lakeview Parkway West Dr.
Indianapolis 46268 
Tel: (317) 299-2071 
FAX: (317) 299-2379 

Avnet Computer 
485 Gradle Drive 
Carmel 46032 
Tel: (317) 575-8029
FAX: (317) 844-4964

Hamilton/Avnet Electronics 
485 Gradle Drive 
Carmel 46032 
Tel: (317) 844-9333
FAX: (317) 844-5921 

†Pioneer/Standard Electronics
9350 Priority Way West Dr. 
Indianapolis 46250 
Tel: (317) 573-0880 
FAX: (317) 573-0979 


†Certified VAD

NORTH AMERICAN DISTRIBUTORS (Contd.)


IOWA 

Hamilton/Avnet Electronics 
2335A Blairsferry Rd., N.E.
Cedar Rapids 52402
Tel: (319) 362-4757
FAX: (319) 393-7050 

KANSAS 

Arrow Electronics, Inc.
8208 Melrose Dr., Suite 210
Lenexa 66214 
Tel: (913) 541-9542 
FAX: (913) 541-0328 

Avnet Computer 
15313 W. 95th Street 
Lenexa 61219 
Tel: (913) 541-7989 
FAX: (913) 541-7904 

†Hamilton/Avnet Electronics
15313 W. 95th
Overland Park 66215
Tel: (913) 888-1055
FAX: (913) 541-7951

KENTUCKY 

Hamilton/Avnet Electronics 
805 A. Newtown Circle 
Lexington 40511
Tel: (606) 259-1475 
FAX: (606) 252-3238 

MARYLAND 

Arrow Commercial Systems Group 
200 Perry Parkway 
Gaithersburg 20877 
Tel: (301) 670-1600 
FAX: (301) 670-0188

†Arrow Electronics, Inc.
8300 Guilford Road, #H
Columbia 21046 
Tel: (301) 995-6002 
FAX: (301) 995-6201 

Avnet Computer
7172 Columbia Gateway Dr., #G
Columbia 21045
Tel: (301) 995-0020
FAX: (301) 995-3515

†Hamilton/Avnet Electronics
7172 Columbia Gateway Dr., #F
Columbia 21045
Tel: (301) 995-3554
FAX: (301) 995-3515

†North Atlantic Industries
Systems Division
7125 Riverwood Dr.
Columbia 21046
Tel: (301) 290-3999 

†Pioneer/Technologies Group, Inc.
15810 Gaither Road 
Gaithersburg 20877 
Tel: (301) 921-0660 
FAX: (301) 670-6746 

MASSACHUSETTS 

Arrow Electronics, Inc.
25 Upton Dr.
Wilmington 01887
Tel: (508) 658-0900 
FAX: (508) 694-1754 

Avnet Computer 
10 D Centennial Drive 
Peabody 01960 
Tel: (508) 532-9886 
FAX: (508) 532-9660

†Hamilton/Avnet Electronics
10 D Centennial Drive
Peabody 01960
Tel: (508) 531-7430
FAX: (508) 532-9802 

†Pioneer/Standard Electronics
44 Hartwell Avenue 
Lexington 02173 
Tel: (617) 861-9200 
FAX: (617) 863-1547 

Wyle Distribution Group 
15 Third Avenue
Burlington 01803 
Tel: (617) 272-7300 
FAX: (617) 272-6809 


MICHIGAN 

†Arrow Electronics, Inc.
19880 Haggerty Road
Livonia 48152 
Tel: (313) 665-4100 
FAX: (313) 462-2686 

Avnet Computer 
2876 28th Street, S.W., #5 
Grandville 49418 
Tel: (616) 531-9607
FAX: (616) 531-0059 

Avnet Computer 
41650 Garden Road 
Novi 48375 
Tel: (313) 347-1820 
FAX: (313) 347-4067 

Hamilton/Avnet Electronics 
2876 28th Street, S.W., #5 
Grandville 49418 
Tel: (616) 243-8805
FAX: (616) 531-0059 

Hamilton/Avnet Electronics 
41650 Garden Brook Rd., #100 
Novi 48375 
Tel: (313) 347-4270 
FAX: (313) 347-4021 

†Pioneer/Standard Electronics
4505 Broadmoor S.E.
Grand Rapids 49512
Tel: (616) 698-1800
FAX: (616) 698-1831

†Pioneer/Standard Electronics
13485 Stamford
Livonia 48150
Tel: (313) 525-1800
FAX: (313) 427-3720

MINNESOTA 

†Arrow Electronics, Inc.
10120A West 76th Street
Eden Prairie 55344 
Tel: (612) 829-5588 
FAX: (612) 942-7803 

Avnet Computer 
10000 West 76th Street 
Eden Prairie 55344 
Tel: (612) 829-0025 
FAX: (612) 944-2781 

†Hamilton/Avnet Electronics
12400 Whitewater Drive
Minnetonka 55343
Tel: (612) 932-0600
FAX: (612) 932-0613

†Pioneer/Standard Electronics
7625 Golden Triangle Dr., #G
Eden Prairie 55344
Tel: (612) 944-3355
FAX: (612) 944-3794 

MISSOURI 

†Arrow Electronics, Inc.
2380 Schuetz Road
St. Louis 63141
Tel: (314) 567-6888
FAX: (314) 567-1164 

Avnet Computer 
739 Goddard Avenue 
Chesterfield 63005 
Tel: (314) 537-2725 
FAX: (314) 537-4248

†Hamilton/Avnet Electronics
741 Goddard
Chesterfield 63005
Tel: (314) 537-1600
FAX: (314) 537-4248 

NEW HAMPSHIRE 

Avnet Computer 
2 Executive Park Drive 
Bedford 03102 
Tel: (603) 624-6630 
FAX: (603) 624-2402 

NEW JERSEY 

†Arrow Electronics, Inc.
4 East Stow Road
Unit 11
Marlton 08053
Tel: (609) 596-8000
FAX: (609) 596-9632

†Arrow Electronics, Inc.
6 Century Drive
Parsippany 07054
Tel: (201) 538-0900
FAX: (201) 538-4962 

Avnet Computer 
1-B Keystone Ave., Bldg. 36 
Cherry Hill 08003 
Tel: (609) 424-8961 
FAX: (609) 751-2502 

Avnet Computer 
10 Industrial Road 
Fairfield 07006 
Tel: (201) 882-2879 
FAX: (201) 808-9251 

†Hamilton/Avnet Electronics
1 Keystone Ave., Bldg. 36
Cherry Hill 08003
Tel: (609) 424-0110
FAX: (609) 751-2552

†Hamilton/Avnet Electronics
10 Industrial 
Fairfield 07006 
Tel: (201) 575-3390 
FAX: (201) 575-5839 

†MTI Systems Sales
6 Century Drive 
Parsippany 07054 
Tel: (201) 539-6496 
FAX: (201) 539-6430 

†Pioneer/Standard Electronics
14-A Madison Rd.
Fairfield 07006
Tel: (201) 575-3510 
FAX: (201) 575-3454 


NEW MEXICO 

Alliance Electronics Inc. 
10510 Research Avenue 
Albuquerque 87123 
Tel: (505) 292-3360 
FAX: (505) 275-6392

Avnet Computer
7801 Academy Road
Bldg. 1, Suite 204
Albuquerque 87109 
Tel: (505) 828-9725 
FAX: (505) 828-0360 

†Hamilton/Avnet Electronics
7801 Academy Rd. N.E.
Bldg. 1, Suite 204
Albuquerque 87108 
Tel: (505) 765-1500 
FAX: (505) 243-1395 


NEW YORK 

†Arrow Electronics, Inc.
3375 Brighton Henrietta Townline Rd.
Rochester 14623
Tel: (716) 427-0300
FAX: (716) 427-0735

Arrow Electronics, Inc.
20 Oser Avenue
Hauppauge 11788 
Tel: (516) 231-1000 
FAX: (516) 231-1072 

Avnet Computer 
933 Motor Parkway 
Hauppauge 11788 
Tel: (516) 231-9040 
FAX: (516) 434-7426 

Avnet Computer 
2060 Townline
Rochester 14623 
Tel: (716) 272-9306 
FAX: (716) 272-9685 

†Hamilton/Avnet Electronics
933 Motor Parkway
Hauppauge 11788
Tel: (516) 231-9800 
FAX: (516) 434-7426 

†Hamilton/Avnet Electronics
2060 Townline Rd.
Rochester 14623
Tel: (716) 292-0730
FAX: (716) 292-0810

Hamilton/Avnet Electronics
103 Twin Oaks Drive
Syracuse 13120
Tel: (315) 437-2641
FAX: (315) 432-0740 

MTI Systems 
50 Horseblock Road 
Brookhaven 11719 
Tel: (516) 924-9400 
FAX: (516) 924-1103

MTI Systems 
1 Penn Plaza 
250 W. 34th Street 
New York 10119 
Tel: (212) 643-1280
FAX: (212) 643-1288 

Pioneer/Standard Electronics 
68 Corporate Drive 
Binghamton 13904 
Tel: (607) 722-9300 
FAX: (607) 722-9562 

†Pioneer/Standard Electronics
60 Crossway Park West 
Woodbury, Long Island 11797 
Tel: (516) 921-8700 
FAX: (516) 921-2143 

†Pioneer/Standard Electronics
840 Fairport Park 
Fairport 14450 
Tel: (716) 381-7070 
FAX: (716) 381-5955 


NORTH CAROLINA 

†Arrow Electronics, Inc.
5240 Greensdairy Road
Raleigh 27604 
Tel: (919) 876-3132 
FAX: (919) 878-9517 

Avnet Computer 
2725 Millbrook Rd., #123 
Raleigh 27604 
Tel: (919) 790-1735 
FAX: (919) 872-4972 

Hamilton/Avnet Electronics 
5250-77 Center Dr. #350 
Charlotte 28217 
Tel: (704) 527-2485 
FAX: (704) 527-8058 

†Hamilton/Avnet Electronics
3510 Spring Forest Drive 
Raleigh 27604 
Tel: (919) 878-0819 

Pioneer/Technologies Group, Inc. 
9401-L Southern Pine Blvd.
Charlotte 28210
Tel: (704) 527-8188
FAX: (704) 522-8564

Pioneer/Technologies Group, Inc.
2810 Meridian Parkway, #148
Durham 27713
Tel: (919) 544-5400
FAX: (919) 544-5885


OHIO 

Arrow Commercial System Group 
284 Cramer Creek Court 
Dublin 43017 
Tel: (614) 889-9347 
FAX: (614) 889-9680 

†Arrow Electronics, Inc.
6573 Cochran Road, #E
Solon 44139
Tel: (216) 248-3990
FAX: (216) 248-1106

Arrow Electronics, Inc.
8200 Washington Village Dr.
Centerville 45458 
Tel: (513) 435-5563 
FAX: (513) 435-2049 



Avnet Computer 
7764 Washington Village Dr. 
Dayton 45459 
Tel: (513) 439-6756 
FAX: (513) 439-6719 

Avnet Computer
30325 Bainbridge Rd., Bldg. A
Solon 44139
Tel: (216) 349-2505
FAX: (216) 349-1894

†Hamilton/Avnet Electronics
7760 Washington Village Dr. 
Dayton 45459 
Tel: (513) 439-6733 
FAX: (513) 439-6711 

†Hamilton/Avnet Electronics
30325 Bainbridge 
Solon 44139 
Tel: (800) 543-2984 
FAX: (216) 349-1894 

Hamilton/Avnet Electronics 
2600 Corp Exchange Drive, #180 
Columbus 43231 
Tel: (614) 882-7004 
FAX: (614) 882-8650 

MTI Systems Sales 
23404 Commerce Park Road 
Beachwood 44122
Tel: (216) 464-6688
FAX: (216) 464-3564

†Pioneer/Standard Electronics
4433 Interpoint Boulevard 
Dayton 45424 
Tel: (513) 236-9900 
FAX: (513) 236-8133 

†Pioneer/Standard Electronics
4800 E. 131st Street 
Cleveland 44105 
Tel: (216) 587-3600 
FAX: (216) 663-1004 

OKLAHOMA 

Arrow Electronics, Inc.
12111 East 51st Street, #101
Tulsa 74146 
Tel: (918) 252-7537 
FAX: (918) 254-0917 

†Hamilton/Avnet Electronics
12121 E. 51st St., Suite 102A
Tulsa 74146
Tel: (918) 664-0444
FAX: (918) 250-8763

OREGON 

†Almac Electronics Corp.
1885 N.W. 169th Place
Beaverton 97006
Tel: (503) 629-8090
FAX: (503) 645-0611

Avnet Computer
9409 Southwest Nimbus Ave.
Beaverton 97005
Tel: (503) 627-0900
FAX: (503) 526-6242

†Hamilton/Avnet Electronics
9409 S.W. Nimbus Ave.
Beaverton 97005
Tel: (503) 627-0201
FAX: (503) 641-4012

Wyle
9640 Sunshine Court
Bldg. G, Suite 200
Beaverton 97005
Tel: (503) 643-7900
FAX: (503) 646-5466

PENNSYLVANIA 

Avnet Computer
213 Executive Drive, #320
Mars 16046
Tel: (412) 772-1888
FAX: (412) 772-1890

Hamilton/Avnet Electronics 
213 Executive, #320 
Mars 16045 
Tel: (412) 281-4152 
FAX: (412) 772-1890 


Pioneer/Technologies Group, Inc. 
259 Kappa Drive 
Pittsburgh 15238 
Tel: (412) 782-2300 
FAX: (412) 963-8255 

†Pioneer/Technologies Group, Inc.
500 Enterprise Road
Keith Valley Business Center
Horsham 19044
Tel: (215) 674-4000
FAX: (215) 674-3107

TENNESSEE 

Arrow Commercial System Group 
3635 Knight Road, #7 
Memphis 38118 
Tel: (901) 367-0540 
FAX: (901) 367-2081 

TEXAS 

Arrow Electronics, Inc.
3220 Commander Drive
Carrollton 75006 
Tel: (214) 380-6464 
FAX: (214) 248-7208 

Avnet Computer 
4004 Beltline, Suite 200 
Dallas 75244 
Tel: (214) 308-8181 
FAX: (214) 308-8129 

Avnet Computer
1235 North Loop West, #525
Houston 77008
Tel: (713) 867-7500
FAX: (713) 861-6851

†Hamilton/Avnet Electronics
1826-F Kramer Lane
Austin 78758 
Tel: (800) 772-5668 
FAX: (512) 832-4315 

†Hamilton/Avnet Electronics
4004 Beltline, #200 
Dallas 75244 
Tel: (214) 308-8111 
FAX: (214) 308-8109 

†Hamilton/Avnet Electronics
1235 N. Loop West, #521 
Houston 77008 
Tel: (713) 240-7733 
FAX: (713) 861-6541 

†Pioneer/Standard Electronics
1826-D Kramer Lane
Austin 78758
Tel: (512) 835-4000
FAX: (512) 835-9829

†Pioneer/Standard Electronics
13765 Beta Road
Dallas 75244
Tel: (214) 386-7300
FAX: (214) 490-6419

†Pioneer/Standard Electronics
10530 Rockley Road, #100 
Houston 77099 
Tel: (713) 495-4700 
FAX: (713) 495-5642 

†Wyle Distribution Group
1810 Greenville Avenue 
Richardson 75081 
Tel: (214) 235-9953 
FAX: (214) 644-5064 

Wyle Distribution Group 
4030 West Braker Lane, #330 
Austin 78758 
Tel: (512) 345-8853 
FAX: (512) 345-9330

Wyle Distribution Group 
11001 South Wilcrest, #100 
Houston 77099 
Tel: (713) 879-9953 
FAX: (713) 879-6540 

UTAH 

Arrow Electronics, Inc.
1946 W. Parkway Blvd.
Salt Lake City 84119
Tel: (801) 973-6913 


Avnet Computer 
1100 E. 6600 South, #150 
Salt Lake City 84121 
Tel: (801) 266-1115 
FAX: (801) 266-0362 

Avnet Computer 
17761 Northeast 78th Place 
Redmond 98052 
Tel: (206) 867-0160 
FAX: (206) 867-0161 

†Hamilton/Avnet Electronics
1100 East 6600 South, #120 
Salt Lake City 84121 
Tel: (801) 972-2800 
FAX: (801) 263-0104 

†Wyle Distribution Group
1325 West 2200 South, #E 
West Valley 84119 
Tel: (801) 974-9953 
FAX: (801) 972-2524 

WASHINGTON 

†Almac Electronics Corp.
14360 S.E. Eastgate Way
Bellevue 98007 
Tel: (206) 643-9992 
FAX: (206) 643-9709 

†Hamilton/Avnet Electronics
17761 N.E. 78th Place, #C 
Redmond 98052 
Tel: (206) 241-8555 
FAX: (206) 241-5472 

Wyle Distribution Group 
15385 N.E. 90th Street 
Redmond 98052 
Tel: (206) 881-1150 
FAX: (206) 881-1567

WISCONSIN 

Arrow Electronics, Inc.
200 N. Patrick Blvd., Ste. 100
Brookfield 53005 
Tel: (414) 792-0150 
FAX: (414) 792-0156 

Avnet Computer
20875 Crossroads Circle, #400
Waukesha 53186
Tel: (414) 784-8205
FAX: (414) 784-6006

†Hamilton/Avnet Electronics
28875 Crossroads Circle, #400 
Waukesha 53186 
Tel: (414) 784-4510 
FAX: (414) 784-9509 

Pioneer/Standard Electronics 
120 Bishops Way #163 
Brookfield 53005 
Tel: (414) 784-3480 

ALASKA 

Avnet Computer
1400 West Benson Blvd.
Suite 400
Anchorage 99503 
Tel: (907) 274-9899 
FAX: (907) 277-2639 


CANADA 

ALBERTA 

Avnet Computer 
2816 21st Street Northeast 
Calgary T2E 6Z2 
Tel: (403) 291-3284 
FAX: (403) 250-1591 

Zentronics
6815 8th Street N.E., #100
Calgary T2E 7H 
Tel: (403) 295-8838 
FAX: (403) 295-8714 

BRITISH COLUMBIA 

†Hamilton/Avnet Electronics
8610 Commerce Court
Burnaby V5A 4N6 
Tel: (604) 420-4101 
FAX: (604) 420-5376 


Zentronics
11400 Bridgeport Rd., #108
Richmond V6X 1T2 
Tel: (604) 273-5575 
FAX: (604) 273-2413 


ONTARIO 

Arrow Electronics, Inc.
36 Antares Dr., Unit 100 
Nepean K2E 7W5 
Tel: (613) 226-6903 
FAX: (613) 723-2018 

†Arrow Electronics, Inc.
1093 Meyerside, Unit 2
Mississauga L5T 1M4
Tel: (416) 670-7769
FAX: (416) 670-7781

Avnet Computer
Canada System Engineering Group
3688 Nashua Dr., Unit 6
Mississauga L4V 1M5
Tel: (416) 672-8638
FAX: (416) 677-5091

Avnet Computer
6845 Rexwood Road
Units 7-9
Mississauga L4V 1M4
Tel: (416) 672-8638 
FAX: (416) 672-8650 

Avnet Computer 
190 Colonade Road 
Nepean K2E 7J5 
Tel: (613) 727-7529 
FAX: (613) 226-1184 

†Hamilton/Avnet Electronics
6845 Rexwood Rd., Units 3-5 
Mississauga L4T 1R2 
Tel: (416) 677-7432 
FAX: (416) 677-0940 

†Hamilton/Avnet Electronics
190 Colonade Road 
Nepean K2E 7J5 
Tel: (613) 226-1700 
FAX: (613) 226-1184 

†Zentronics
1355 Meyerside Drive 
Mississauga L5T 1C9 
Tel: (416) 564-9600 
FAX: (416) 564-3127 

†Zentronics
155 Colonade Rd., South
Unit 17
Nepean K2E 7K1
Tel: (613) 226-8840 
FAX: (613) 226-6352 


QUEBEC 

Arrow Electronics, Inc.
1100 St. Regis Blvd.
Dorval H9P 2T5
Tel: (514) 421-7411
FAX: (514) 421-7430

Arrow Electronics, Inc.
500 Boul. St-Jean-Baptiste Ave.
Quebec H2E 5R9 
Tel: (418) 871-7500 
FAX: (418) 871-6816 

Avnet Computer
2795 Rue Halpern 
St. Laurent H4S 1P8 
Tel: (514) 335-2483 
FAX: (514) 335-2481 

†Hamilton/Avnet Electronics
2795 Halpern
St. Laurent H4S 1P8
Tel: (514) 335-1000
FAX: (514) 335-2481

†Zentronics
520 McCaffrey 
St. Laurent H4T 1N3 
Tel: (514) 737-9700 
FAX: (514) 737-5212 


†Certified VAD




EUROPEAN SALES OFFICES 


FINLAND

Intel Finland OY
Ruosilantie 2
00390 Helsinki
Tel: (358) 0 544 644
FAX: (358) 0 544 030

FRANCE

Intel Corporation S.A.R.L.
1, Rue Edison-BP 303
78054 St. Quentin-en-Yvelines
Cedex
Tel: (33) (1) 30 57 70 00
FAX: (33) (1) 30 64 60 32

GERMANY

Intel GmbH
Dornacher Strasse 1
8016 Feldkirchen bei Muenchen
Tel: (49) 089/90992-0
FAX: (49) 089/9043948

ISRAEL

Intel Semiconductor Ltd.
Atidim Industrial Park-Neve Sharet
P.O. Box 43202
Tel-Aviv 61430
Tel: (972) 03 498080
FAX: (972) 03 491870

ITALY

Intel Corporation Italia S.p.A.
Milanofiori Palazzo E
20094 Assago
Milano
Tel: (39) (02) 89200950
FAX: (39) (2) 3498464

NETHERLANDS

Intel Semiconductor B.V.
Postbus 84130
3009 CC Rotterdam
Tel: (31) 10 407 11 11
FAX: (31) 10 455 4688

SPAIN

Intel Iberia S.A.
Zubaran, 28
28010 Madrid
Tel: (34) 308 25 52
FAX: (34) 410 7570

SWEDEN

Intel Sweden A.B.
Dalvagen 24
171 36 Solna
Tel: (46) 8 734 01 00
FAX: (46) 8 278085

UNITED KINGDOM

Intel Corporation (U.K.) Ltd.
Pipers Way
Swindon, Wiltshire SN3 1RJ
Tel: (44) (0793) 696000
FAX: (44) (0793) 641440


EUROPEAN DISTRIBUTORS/REPRESENTATIVES 


AUSTRIA 

Bacher Electronics GmbH 
Rotenmuehlgasse 26 
A-1120 Wien
Tel: 43 222 81356460 
FAX: 43 222 834276 

BELGIUM 

Inelco Belgium S.A. 
Oorlogskruisenlaan 94 
B-1120 Bruxelles 
Tel: 32 2 244 2811
FAX: 32 2 216 4301 

FRANCE 

Almex
48, Rue de l'Aubepine
B.P. 102
92164 Antony Cedex
Tel: 33 1 4096 5400
FAX: 33 1 4666 6028

Lex Electronics
Silic 585
60 Rue des Gemeaux
94663 Rungis Cedex
Tel: 33 1 4978 4978
FAX: 33 1 4978 0596

Metrologie 
Tour d’Asnieres 
4, Avenue Laurent Cely 
92606 Asnieres Cedex 
Tel: 33 1 4790 6240 
FAX: 33 1 4790 5947 

Tekelec-Airtronic
Cite Des Bruyeres
Rue Carle Vernet
BP 2
92310 Sevres
Tel: 33 1 4623 2425
FAX: 33 1 4507 2191

GERMANY 

E2000 Vertriebs-AG 
Stahlgruberring 12 
8000 Muenchen 82 
Tel: 49 89 420010 
FAX: 49 89 42001209 

Jermyn GmbH 
Im Dachsstueck 9 
6250 Limburg 
Tel: 49 6431 5080 
FAX: 49 6431 508289 

Metrologie GmbH 
Steinerstrasse 15 
8000 Muenchen 70 
Tel: 49 89 724470 
FAX: 49 89 72447111


Proelectron Vertriebs GmbH 
Max-Planck-Strasse 1-3 
6072 Dreieich 
Tel: 49 6103 304343 
FAX: 49 6103 304425 

Rein Electronik GmbH 
Loetscher Weg 66 
4054 Nettetal 1 
Tel: 49 2153 7330 
FAX: 49 2153 733513 


GREECE 

Pouliadis Associates Corp. 
5 Koumbari Street 
Kolonaki Square 
10674 Athens 
Tel: 30 1 360 3741 
FAX: 30 1 360 7501 


IRELAND 

Micro Marketing 
Tany Hall 
Eglinton Terrace 
Dundrum 
Dublin
Tel: 0001 989 400
FAX: 0001 989 8282 


ISRAEL 

Eastronics Ltd. 
Rozanis 11
P.O.B. 39300 
Tel Baruch 
Tel-Aviv 61392 
Tel: 972 3 475151 
FAX: 972 3 475125 


ITALY 

Celdis Spa
Via F.lli Gracchi 36
20092 Cinisello Balsamo
Milano
Tel: 39 2 66012003
FAX: 39 2 6182433

Intesi Div. Della Deutsche
Divisione ITT
Industries GmbH
P.I. 06550110156
Milanofiori Palazzo E5
20094 Assago (Milano)
Tel: 39 2 824701
FAX: 39 2 8242631

Lasi Elettronica S.p.A.
P.I. 00839000155
Viale Fulvio Testi, N.280
20126 Milano
Tel: 39 2 66101370
FAX: 39 2 66101385

Telcom s.r.l. - Divisione MDS
Via Trombetta
Zona Marconi
Strada Cassanese
Segrate - Milano
Tel: 39 2 2138010
FAX: 39 2 216061


NETHERLANDS 

Koning en Hartman B.V. 
Energieweg 1 
2627 AP Delft 
Tel: 31 15 609 906 
FAX: 31 15 619 194


PORTUGAL 

ATD Electronica LDA
Rua Dr. Faria de 
Vasconcelos, 3a 
1900 Lisboa 
Tel: 351 1 8472200 
FAX: 351 1 8472197 


SPAIN 

ATD Electronica 
Plaza Ciudad de Viena, 6 
28040 Madrid 
Tel: 34 1 534 4000/09 
FAX: 34 1 534 7663 

Metrologia Iberica
Ctra. De Fuencarral N.80
28100 Alcobendas
Madrid
Tel: 34 1 6538611
FAX: 34 1 6517549


SCANDINAVIA 

OY Fintronic AB
Heikkilantie 2a
SF-00210 Helsinki
Tel: 358 0 6926022
FAX: 358 0 6821251

ITT Multikomponent A/S
Naverland 29
DK-2600 Glostrup
Denmark
Tel: 010 45 42 451822
FAX: 010 45 42 457624

Nordisk Elektronik A/S
Postboks 122
Smedsvingen 4
N-1364 Hvalstad
Norway
Tel: 47 2 846210
FAX: 47 2 846545

Nordisk Electronik AB
Box 36
Torshamnsgatan 39
S-16493 Kista
Sweden
Tel: 46 8 7034630
FAX: 46 8 7039845


SWITZERLAND 

Industrade A.G. 
Hertistrasse 31 
CH-8304 Wallisellen 
Tel: 41 1 8328111
FAX: 41 1 8307550


TURKEY 

EMPA
80050 Sishane
Refik Saydam Cad No. 89/5
Istanbul
Tel: 90 1 143 6212
FAX: 90 1 143 6547 


UNITED KINGDOM 

Access Elect Comp Ltd.
Jubilee House
Jubilee Road
Letchworth
Hertfordshire
SG6 1QH
Tel: 0462 480888 
FAX: 0462 682467 

Bytech Components Ltd. 
12a Cedarwood
Chineham Business Park 
Crockford Lane 
Basingstoke 
Hants RG12 1RW 
Tel: 0256 707107 
FAX: 0256 707162 


Bytech Systems 
Units
The Western Centre
Western Road 
Bracknell 
Berks RG12 1RW 
Tel: 0344 55333 
FAX: 0344 867270 

Metrologie 
Rapid House 
Oxford Road 
High Wycombe 
Bucks HP11 2EE
Tel: 0494 474147 
FAX: 0494 452144 

Jermyn 
Vestry Estate 
Otford Road 
Sevenoaks 
Kent TN14 5EU 
Tel: 0732 450144
FAX: 0732 451251

MMD
3 Bennet Court
Bennet Road
Reading
Berkshire RG2 0QX
Tel: 0734 313232 
FAX: 0734 313255 

Rapid Silicon 
3 Bennet Court 
Bennet Road 
Reading 
Berks RG2 0QX
Tel: 0734 752266 
FAX: 0734 312728 

Metro Systems 
Rapid House 
Oxford Road 
High Wycombe 
Bucks HP11 2EE 
Tel: 0494 474171
FAX: 0494 21860 


YUGOSLAVIA 

H.R. Microelectronics Corp.
2005 de la Cruz Blvd.
Suite 220
Santa Clara, CA 95050
U.S.A.
Tel: (408) 988-0286
FAX: (408) 988-0306 




INTERNATIONAL SALES OFFICES 


AUSTRALIA 

Intel Australia Pty. Ltd.
Unit 13
Allambie Grove Business Park
25 Frenchs Forest Road East
Frenchs Forest, NSW, 2086
Sydney
Tel: 61-2-975-3300
FAX: 61-2-975-3375

Intel Australia Pty. Ltd.
711 High Street
1st Floor
East Kew, Vic., 3102
Melbourne
Tel: 61-3-810-2141
FAX: 61-3-819-7200

BRAZIL 

Intel Semiconductores do Brazil LTDA
Avenida Paulista, 1159-CJS 404/405
01311 - Sao Paulo - S.P.
Tel: 55-11-287-5899
TLX: 11-37-557-ISDB 
FAX: 55-11-287-5119 

CHINA/HONG KONG


Intel Semiconductor Ltd.* 
10/F East Tower 
Bond Center 
Queensway, Central 
Hong Kong 
Tel: (852) 844-4555 
FAX: (852) 868-1989 


INDIA 

Intel Asia Electronics, Inc.
4/2, Samrah Plaza
St. Mark's Road
Bangalore 560001
Tel: 91-812-215773
TLX: 953-845-2646 INTEL IN
FAX: 091-812-215067


JAPAN 

Intel Japan K.K.
5-6 Tokodai, Tsukuba-shi
Ibaraki, 300-26
Tel: 0298-47-8511
FAX: 0298-47-8450

Intel Japan K.K.*
Bldg. Kumagaya
2-69 Hon-cho
Kumagaya-shi, Saitama 360
Tel: 0485-24-6871
FAX: 0485-24-7518

Intel Japan K.K.*
Kawa-asa Bldg.
2-11-5 Shin-Yokohama
Kohoku-ku, Yokohama-shi
Kanagawa, 222
Tel: 045-474-7661
FAX: 045-471-4394

Intel Japan K.K.*
Ryokuchi-Eki Bldg.
2-4-1 Terauchi
Toyonaka-shi, Osaka 560
Tel: 06-863-1091
FAX: 06-863-1084

Intel Japan K.K.
Shinmaru Bldg.
1-5-1 Marunouchi
Chiyoda-ku, Tokyo 100 
Tel: 03-3201-3621 
FAX: 03-3201-6850 


Intel PRC Corporation 
15/F, Office 1, Citic Bldg. 
Jian Guo Men Wai Street 
Beijing, PRC 
Tel: (1) 500-4850 
TLX: 22947 INTEL CN 
FAX: (1) 500-2953 


Intel Japan K.K.*
Hachioji ON Bldg.
4-7-14 Myojin-machi
Hachioji-shi, Tokyo 192
Tel: 0426-48-8770
FAX: 0426-48-8775

Intel Japan K.K.
Green Bldg.
1-16-20 Nishiki
Naka-ku, Nagoya-shi 
Aichi 460 
Tel: 052-204-1261 
FAX: 052-204-1285 


KOREA 


Intel Korea, Ltd.
16th Floor, Life Bldg.
61 Yoido-dong, Youngdeungpo-Ku
Seoul 150-010
Tel: (2) 784-8186
FAX: (2) 784-8096


SINGAPORE 


Intel Singapore Technology, Ltd.
101 Thomson Road #08-03/06
United Square
Singapore 1130
Tel: (65) 250-7811
FAX: (65) 250-9256


TAIWAN 


Intel Technology Far East Ltd.
Taiwan Branch Office
8th Floor, No. 205
Bank Tower Bldg.
Tung Hua N. Road
Taipei
Tel: 886-2-5144202
FAX: 886-2-717-2455


INTERNATIONAL DISTRIBUTORS/REPRESENTATIVES 


ARGENTINA 

Dafsys S.R.L. 
Chacabuco, 90-6 Piso 
1069-Buenos Aires 
Tel: 54-1-34-7726 
FAX: 54-1-34-1871 


AUSTRALIA 

Email Electronics 
15-17 Hume Street 
Huntingdale, 3166 
Tel: 011-61-3-544-8244 
TLX: AA 30895 
FAX: 011-61-3-543-8179 

NSD-Australia 
205 Middleborough Rd 
Box Hill, Victoria 3128 
Tel: 03 8900970 
FAX: 03 8990819 


BRAZIL 

Microlinear
Largo do Arouche, 24
01219 Sao Paulo, SP 
Tel: 5511-220-2215 
FAX: 5511-220-5750 


CHILE 

Sisteco
Vecinal 40 - Las Condes
Santiago
Tel: 562-234-1644
FAX: 562-233-9895 


CHINA/HONG KONG 

Novel Precision Machinery Co., Ltd.
Room 728 Trade Square
681 Cheung Sha Wan Road
Kowloon, Hong Kong
Tel: (852) 360-8999
TWX: 32032 NVTNL HX
FAX: (852) 725-3695


GUATEMALA 

Abinitio
11 Calle 2 - Zona 9
Guatemala City 
Tel: 5022-32-4104 
FAX: 5022-32-4123 


INDIA 

Micronic Devices 
Arun Complex 
No. 65 D.V.G. Road 
Basavanagudi 
Bangalore 560 004 
Tel: 011-91-812-600-631 
011-91-812-611-365 
TLX: 9538458332 MDBG 

Micronic Devices
No. 516 5th Floor
Swastik Chambers
Sion, Trombay Road
Chembur
Bombay 400 071
TLX: 9531 171447 MDEV

Micronic Devices 
25/8, 1st Floor 
Bada Bazaar Marg 
Old Rajinder Nagar 
New Delhi 110 060 
Tel: 011-91-11-5723509 
011-91-11-589771 
TLX: 031-63253 MDND IN 

Micronic Devices
6-3-348/12A Dwarakapuri Colony
Hyderabad 500 482
Tel: 011-91-842-226748

S&S Corporation 
1587 Kooser Road 
San Jose, CA 95118 
Tel: (408) 978-6216 
TLX: 820281 
FAX: (408) 978-8635 


JAMAICA 

MC Systems 
10-12 Grenada Crescent 
Kingston 5 
Tel: (809) 929-2638 
(809) 926-0188 
FAX: (809) 926-0104 


JAPAN 

Asahi Electronics Co. Ltd. 
KMM Bldg. 2-14-1 Asano 
Kokurakita-ku 
Kitakyushu-shi 802 
Tel: 093-511-6471 
FAX: 093-551-7861 


CTC Components Systems Co., Ltd. 
4-8-1 Dobashi, Miyamae-ku 
Kawasaki-shi, Kanagawa 213 
Tel: 044-852-5121 
FAX: 044-877-4268

Dia Semicon Systems, Inc.
Flower Hill Shinmachi Higashi-kan
1-23 Shinmachi, Setagaya-ku
Tokyo 154
Tel: 03-3439-1600
FAX: 03-3439-1601

Okaya Koki
2-4-18 Sakae
Naka-ku, Nagoya-shi 460
Tel: 052-204-8315
FAX: 052-204-8380

Ryoyo Electro Corp.
Konwa Bldg.
1-12-22 Tsukiji
Chuo-ku, Tokyo 104
Tel: 03-3546-5011
FAX: 03-3546-5044

KOREA 

J-Tek Corporation
Dong Sung Bldg. 9/F
158-24, Samsung-Dong, Kangnam-Ku
Seoul 135-090
Tel: (822) 557-8039
FAX: (822) 557-8304

Samsung Electronics
Samsung Main Bldg.
150 Taepyung-Ro-2KA, Chung-Ku
Seoul 100-102
C.P.O. Box 8780
Tel: (822) 751-3680
TWX: KORSST K 27970
FAX: (822) 753-9065

MEXICO 

PSI S.A. de C.V.
Fco. Villa esq. Ajusco s/n
Cuernavaca, MOR 62130
Tel: 52-73-13-9412 
52-73-17-5340 
FAX: 52-73-17-5333 

NEW ZEALAND

Email Electronics
36 Olive Road
Penrose, Auckland
Tel: 011-64-9-591-155
FAX: 011-64-9-592-681 


SAUDI ARABIA 

AAE Systems, Inc.
642 N. Pastoria Ave.
Sunnyvale, CA 94086
U.S.A.
Tel: (408) 732-1710
FAX: (408) 732-3095 
TLX: 494-3405 AAE SYS 

SINGAPORE 

Electronic Resources Pte, Ltd.
17 Harvey Road
#03-01 Singapore 1336 
Tel: (65) 283-0888 
TWX: RS 56541 ERS 
FAX: (65) 289-5327 

SOUTH AFRICA 

Electronic Building Elements
178 Erasmus St. (off Watermeyer St.)
Meyerspark, Pretoria, 0184
Tel: 011-2712-803-7680
FAX: 011-2712-803-8294

TAIWAN 

Micro Electronics Corporation 
12th Floor, Section 3 
285 Nanking East Road 
Taipei, R.O.C.
Tel: (886) 2-7198419
FAX: (886) 2-7197916 

Acer Sertek Inc.
15th Floor, Section 2
Chien Kuo North Rd.
Taipei 18479 R.O.C.
Tel: 886-2-501-0055
TWX: 23756 SERTEK 
FAX: (886) 2-5012521 

URUGUAY 

Interfase 
Zabala 1378 
11000 Montevideo 
Tel: 5982-96-0490 
5982-96-1143 
FAX: 5982-96-2965 

VENEZUELA 

Unixel C.A.
4 Transversal de Monte Cristo
Edf. AXXA, Piso 1, of. 1&2
Centro Empresarial Boleita
Caracas
Tel: 582-238-6082
FAX: 582-238-1816 


*Field Application Location

ALASKA

Intel Corp.
c/o TransAlaska Network
1515 Lore Rd
Anchorage 99507
Tel: (907) 522-1776

Intel Corp.
c/o TransAlaska Data Systems
c/o GCI Operations
520 Fifth Ave., Suite 407
Fairbanks 99701
Tel: (907) 452-6264

ARIZONA

*Intel Corp.
410 North 44th Street
Suite 500
Phoenix 85008
Tel: (602) 231-0386
FAX: (602) 244-0446

*Intel Corp.
500 E. Fry Blvd., Suite M-15
Sierra Vista 85635
Tel: (602) 459-5010

ARKANSAS

Intel Corp.
c/o Federal Express
1500 West Park Drive
Little Rock 72204

CALIFORNIA 

*Intel Corp.
21515 Vanowen St., Ste. 116
Canoga Park 91303
Tel: (818) 704-8500

*Intel Corp.
300 N. Continental Blvd.
Suite 100
El Segundo 90245
Tel: (213) 640-6040

*Intel Corp.
1900 Prairie City Rd.
Folsom 95630-9597
Tel: (916) 351-6143

*Intel Corp.
9665 Chesapeake Dr., Suite 325
San Diego 92123
Tel: (619) 292-8086

**Intel Corp.
400 N. Tustin Avenue
Suite 450
Santa Ana 92705
Tel: (714) 835-9642

**Intel Corp.
2700 San Tomas Exp., 1st Floor
Santa Clara 95051
Tel: (408) 970-1747

COLORADO 

*Intel Corp.
600 S. Cherry St., Suite 700
Denver 80222 
Tel: (303) 321-8086 


ARIZONA 

2402 W. Beardsley Road 
Phoenix 85027 
Tel: (602) 869-4288 
1-800-468-3548


MINNESOTA 

3500 W. 80th Street
Suite 360
Bloomington 55431
Tel: (612) 835-6722


*Carry-in locations 
**Carry-in/mail-in locations


NORTH AMERICAN SERVICE OFFICES 


CONNECTICUT 

*Intel Corp.
301 Lee Farm Corporate Park
83 Wooster Heights Rd.
Danbury 06811
Tel: (203) 748-3130

FLORIDA

**Intel Corp.
800 Fairway Dr., Suite 160
Deerfield Beach 33441
Tel: (305) 421-0506
FAX: (305) 421-2444

*Intel Corp.
5850 T.G. Lee Blvd., Ste. 340
Orlando 32822
Tel: (407) 240-8000

GEORGIA

*Intel Corp.
20 Technology Park, Suite 150
Norcross 30092 
Tel: (404) 449-0541 

5523 Theresa Street 
Columbus 31907 

HAWAII 

**Intel Corp.
Honolulu 96820
Tel: (808) 847-6738

ILLINOIS

**†Intel Corp.
Woodfield Corp. Center III
300 N. Martingale Rd., Ste. 400
Schaumburg 60173
Tel: (708) 605-8031

INDIANA

*Intel Corp.
8910 Purdue Rd., Ste. 350
Indianapolis 46268
Tel: (317) 875-0623

KANSAS

*Intel Corp.
10985 Cody, Suite 140
Overland Park 66210
Tel: (913) 345-2727

KENTUCKY 

Intel Corp.
133 Walton Ave., Office 1A
Lexington 40508
Tel: (606) 255-2957

Intel Corp.
896 Hillcrest Road, Apt. A
Radcliff 40160 (Louisville) 

LOUISIANA 

Hammond 70401 
(serviced from Jackson, MS) 


MARYLAND 

**Intel Corp.
10010 Junction Dr., Suite 200
Annapolis Junction 20701
Tel: (301) 206-2860

MASSACHUSETTS

**Intel Corp.
Westford Corp. Center
3 Carlisle Rd., 2nd Floor
Westford 01886
Tel: (508) 692-0960

MICHIGAN

*Intel Corp.
7071 Orchard Lake Rd., Ste. 100
West Bloomfield 48322
Tel: (313) 851-8905

MINNESOTA

*Intel Corp.
3500 W. 80th St., Suite 360
Bloomington 55431 
Tel: (612) 835-6722 

MISSISSIPPI 

Intel Corp.
c/o Compu-Care
2001 Airport Road, Suite 205F
Jackson 39208
Tel: (601) 932-6275

MISSOURI

*Intel Corp.
3300 Rider Trail South
Suite 170
Earth City 63045
Tel: (314) 291-1990

Intel Corp.
Route 2, Box 221
Smithville 64089
Tel: (913) 345-2727

NEW JERSEY 

“Intel Corp. 

300 Sylvan Avenue 
Englewood Cliffs 07632 
Tel: (201) 567-0821 

*Intel Corp.

Lincroft Office Center 
125 Half Mile Road 
Red Bank 07701 
Tel: (908) 747-2233 

NEW MEXICO 

Intel Corp. 

Rio Rancho 1 
4100 Sara Road 
Rio Rancho 87124-1025 
(near Albuquerque) 

Tel: (505) 893-7000 


NEW YORK 

*Intel Corp.

2950 Expressway Dr. South 
Suite 130 
Islandia 11722 
Tel: (516) 231-3300 

Intel Corp. 

300 Westage Business Center 
Suite 230 
Fishkill 12524 
Tel: (914) 897-3860 

Intel Corp. 

5858 East Molloy Road 
Syracuse 13211 
Tel: (315) 454-0576 


NORTH CAROLINA 

*Intel Corp.

5800 Executive Center Drive 
Suite 105 
Charlotte 28212 
Tel: (704) 568-8966 

**Intel Corp.

5540 Centerview Dr., Suite 215 
Raleigh 27606 
Tel: (919) 851-9537


OHIO 

**Intel Corp.

3401 Park Center Dr., Ste. 220 
Dayton 45414 
Tel: (513) 890-5350 

*Intel Corp.

25700 Science Park Dr., Ste. 100 
Beachwood 44122
Tel: (216) 464-2736


OREGON 

**Intel Corp.

15254 N.W. Greenbrier Pkwy. 
Building B 
Beaverton 97006 
Tel: (503) 645-8051 


PENNSYLVANIA 

*†Intel Corp.

925 Harvest Drive 
Suite 200 
Blue Bell 19422 
Tel: (215) 641-1000
1-800-468-3548
FAX: (215) 641-0785

**†Intel Corp.

400 Penn Center Blvd., Ste. 610 
Pittsburgh 15235 
Tel: (412) 823-4970 

*Intel Corp.

1513 Cedar Cliff Dr. 

Camp Hill 17011 
Tel: (717) 761-0860 


PUERTO RICO 

Intel Corp. 

South Industrial Park 
P.O. Box 910 
Las Piedras 00671 
Tel: (809) 733-8616 

TEXAS 

**Intel Corp.

Westech 360, Suite 4230 
8911 N. Capitol of Texas Hwy.
Austin 78752-1239 
Tel: (512) 794-8086

**†Intel Corp.

12000 Ford Rd., Suite 401 

Dallas 75234 

Tel: (214) 241-8087

**Intel Corp.

7322 SW Freeway, Suite 1490 
Houston 77074 
Tel: (713) 988-8086 

UTAH 

Intel Corp. 

428 East 6400 South 
Suite 104 
Murray 84107 
Tel: (801) 263-8051 
FAX: (801) 268-1457 

VIRGINIA 

*Intel Corp.

9030 Stony Point Pkwy. 

Suite 360 
Richmond 23235 
Tel: (804) 330-9393 

WASHINGTON 

**Intel Corp.

155 108th Avenue N.E., Ste. 386 
Bellevue 98004 
Tel: (206) 453-8086 

CANADA 

ONTARIO 

**Intel Semiconductor of
Canada, Ltd. 

2650 Queensview Dr., Ste. 250 
Ottawa K2B 8H6 
Tel: (613) 829-9714 

**Intel Semiconductor of
Canada, Ltd.

190 Attwell Dr., Ste. 102
Rexdale (Toronto) M9W 6H8
Tel: (416) 675-2105

QUEBEC 

**Intel Semiconductor of
Canada, Ltd.

1 Rue Holiday 
Suite 115 
Tour East 
Pt. Claire H9R 5N3 
Tel: (514) 694-9130 
FAX: (514) 694-0064


CUSTOMER TRAINING CENTERS 


SYSTEMS ENGINEERING OFFICES 


NEW YORK 

2950 Expressway Dr., South 
Islandia 11722 
Tel: (516) 231-3300

Multimedia and
Supercomputing Processors 
Intel Corporation’s Multimedia and 
Supercomputing Components Group products 
enrich computerized information and exchange 
technologies in imaginative new ways never 
before possible. To learn more about Intel’s 
problem-solving MSCG products: the i750
video processor, and the i860 and i960
microprocessor families, you will want to read 
this publication. 